13.3. Object Detection and Bounding Boxes
In earlier sections (e.g., Section 7.1–Section 7.4), we introduced various models for image classification. In image classification tasks, we assume that there is only one major object in the image and we only focus on how to recognize its category. However, there are often multiple objects in the image of interest. We not only want to know their categories, but also their specific positions in the image. In computer vision, we refer to such tasks as object detection (or object recognition).
Object detection has been widely applied in many fields. For example, self-driving needs to plan traveling routes by detecting the positions of vehicles, pedestrians, roads, and obstacles in the captured video images. Besides, robots may use this technique to detect and localize objects of interest throughout its navigation of an environment. Moreover, security systems may need to detect abnormal objects, such as intruders or bombs.
In the next few sections, we will introduce several deep learning methods for object detection. We will begin with an introduction to positions (or locations) of objects.
%matplotlib inline
from mxnet import image, np, npx
from d2l import mxnet as d2l

npx.set_np()
%matplotlib inline
import torch
from d2l import torch as d2l
%matplotlib inline
import tensorflow as tf
from d2l import tensorflow as d2l
We will load the sample image to be used in this section. We can see that there is a dog on the left side of the image and a cat on the right. They are the two major objects in this image.
d2l.set_figsize()
img = image.imread('../img/catdog.jpg').asnumpy()
d2l.plt.imshow(img);
d2l.set_figsize()
img = d2l.plt.imread('../img/catdog.jpg')
d2l.plt.imshow(img);
d2l.set_figsize()
img = d2l.plt.imread('../img/catdog.jpg')
d2l.plt.imshow(img);
13.3.1. Bounding Boxes
In object detection, we usually use a bounding box to describe the spatial location of an object. The bounding box is rectangular, which is determined by the \(x\) and \(y\) coordinates of the upper-left corner of the rectangle and the corresponding coordinates of the lower-right corner. Another commonly used bounding box representation is the \((x, y)\)-axis coordinates of the bounding box center, and the width and height of the box.
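For example (with purely illustrative coordinates), a box whose upper-left corner is \((10, 20)\) and whose lower-right corner is \((110, 220)\) has center \(\left(\frac{10+110}{2}, \frac{20+220}{2}\right) = (60, 120)\), width \(110-10=100\), and height \(220-20=200\).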
Here we define functions to convert between these two representations: box_corner_to_center converts from the two-corner representation to the center-width-height representation, and box_center_to_corner vice versa. The input argument boxes should be a two-dimensional tensor of shape (\(n\), 4), where \(n\) is the number of bounding boxes.
#@save
def box_corner_to_center(boxes):
    """Convert from (upper-left, lower-right) to (center, width, height)."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    cx = (x1 + x2) / 2
    cy = (y1 + y2) / 2
    w = x2 - x1
    h = y2 - y1
    boxes = np.stack((cx, cy, w, h), axis=-1)
    return boxes

#@save
def box_center_to_corner(boxes):
    """Convert from (center, width, height) to (upper-left, lower-right)."""
    cx, cy, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    x1 = cx - 0.5 * w
    y1 = cy - 0.5 * h
    x2 = cx + 0.5 * w
    y2 = cy + 0.5 * h
    boxes = np.stack((x1, y1, x2, y2), axis=-1)
    return boxes
#@save
def box_corner_to_center(boxes):
    """Convert from (upper-left, lower-right) to (center, width, height)."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    cx = (x1 + x2) / 2
    cy = (y1 + y2) / 2
    w = x2 - x1
    h = y2 - y1
    boxes = torch.stack((cx, cy, w, h), axis=-1)
    return boxes

#@save
def box_center_to_corner(boxes):
    """Convert from (center, width, height) to (upper-left, lower-right)."""
    cx, cy, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    x1 = cx - 0.5 * w
    y1 = cy - 0.5 * h
    x2 = cx + 0.5 * w
    y2 = cy + 0.5 * h
    boxes = torch.stack((x1, y1, x2, y2), axis=-1)
    return boxes
#@save
def box_corner_to_center(boxes):
    """Convert from (upper-left, lower-right) to (center, width, height)."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    cx = (x1 + x2) / 2
    cy = (y1 + y2) / 2
    w = x2 - x1
    h = y2 - y1
    boxes = tf.stack((cx, cy, w, h), axis=-1)
    return boxes

#@save
def box_center_to_corner(boxes):
    """Convert from (center, width, height) to (upper-left, lower-right)."""
    cx, cy, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    x1 = cx - 0.5 * w
    y1 = cy - 0.5 * h
    x2 = cx + 0.5 * w
    y2 = cy + 0.5 * h
    boxes = tf.stack((x1, y1, x2, y2), axis=-1)
    return boxes
We volition define the bounding boxes of the dog and the cat in the image based on the coordinate information. The origin of the coordinates in the image is the upper-left corner of the prototype, and to the correct and down are the positive directions of the \(ten\) and \(y\) axes, respectively.
# Here `bbox` is the abbreviation for bounding box
dog_bbox, cat_bbox = [60.0, 45.0, 378.0, 516.0], [400.0, 112.0, 655.0, 493.0]
We can verify the correctness of the two bounding box conversion functions by converting twice.
boxes = np.array((dog_bbox, cat_bbox))
box_center_to_corner(box_corner_to_center(boxes)) == boxes
array([[ True,  True,  True,  True],
       [ True,  True,  True,  True]])
boxes = torch.tensor((dog_bbox, cat_bbox))
box_center_to_corner(box_corner_to_center(boxes)) == boxes
tensor([[True, True, True, True],
        [True, True, True, True]])
boxes = tf.constant((dog_bbox, cat_bbox))
box_center_to_corner(box_corner_to_center(boxes)) == boxes
<tf.Tensor: shape=(2, 4), dtype=bool, numpy=
array([[ True,  True,  True,  True],
       [ True,  True,  True,  True]])>
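As a further sanity check, we can also print the center-width-height form of the two boxes themselves. This is a minimal sketch (not part of the original notebook) that works with whichever framework tab is active, since the boxes tensor and the conversion function defined above match; the expected numbers follow directly from the conversion formulas.

# Inspect the center-width-height form of the two boxes
print(box_corner_to_center(boxes))
# Expected from the formulas above:
# dog: center (219.0, 280.5), width 318.0, height 471.0
# cat: center (527.5, 302.5), width 255.0, height 381.0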
Let us draw the bounding boxes in the image to check if they are accurate. Before drawing, we will define a helper function bbox_to_rect. It represents the bounding box in the bounding box format of the matplotlib package.
#@save
def bbox_to_rect(bbox, color):
    """Convert bounding box to matplotlib format."""
    # Convert the bounding box (upper-left x, upper-left y, lower-right x,
    # lower-right y) format to the matplotlib format: ((upper-left x,
    # upper-left y), width, height)
    return d2l.plt.Rectangle(
        xy=(bbox[0], bbox[1]), width=bbox[2] - bbox[0],
        height=bbox[3] - bbox[1], fill=False, edgecolor=color, linewidth=2)
After adding the bounding boxes on the image, we can see that the main outlines of the two objects are basically inside the two boxes.
fig = d2l.plt.imshow(img)
fig.axes.add_patch(bbox_to_rect(dog_bbox, 'blue'))
fig.axes.add_patch(bbox_to_rect(cat_bbox, 'red'));
fig = d2l.plt.imshow(img)
fig.axes.add_patch(bbox_to_rect(dog_bbox, 'blue'))
fig.axes.add_patch(bbox_to_rect(cat_bbox, 'red'));
fig = d2l.plt.imshow(img)
fig.axes.add_patch(bbox_to_rect(dog_bbox, 'blue'))
fig.axes.add_patch(bbox_to_rect(cat_bbox, 'red'));
13.3.2. Summary
- Object detection not only recognizes all the objects of interest in the image, but also their positions. The position is generally represented by a rectangular bounding box.
- We can convert between two commonly used bounding box representations.
13.3.3. Exercises
- Find another image and try to label a bounding box that contains the object. Compare labeling bounding boxes and categories: which usually takes longer?
- Why is the innermost dimension of the input argument boxes of box_corner_to_center and box_center_to_corner always 4?
Source: https://d2l.ai/chapter_computer-vision/bounding-box.html