Performing Image Annotation using Python and OpenCV | by Wei-Meng Lee | Apr, 2023
Learn how to create bounding boxes for your images
One of the common tasks in deep learning is object detection, a process in which you locate specific objects in a given image. An example of object detection is detecting cars in an image, where you could tally the total number of cars detected. This can be useful in cases where you need to analyze the traffic flow at a specific junction.
In order to train a deep learning model to detect specific objects, you need to supply your model with a set of training images, with the coordinates of the specific object in the images all mapped out. This process is known as image annotation. Image annotation assigns labels to objects present in an image, with the objects all marked out.
In this article, I'll show you how to use Python and OpenCV to annotate your images. You'll use your mouse to mark out the object that you are annotating, and the application will draw a bounding rectangle around the object. You can then see the coordinates of the object you have mapped out and optionally save them to a log file.
First, create a text file and name it bounding.py. Then, populate it with the following statements:
import argparse
import cv2

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="Path to image")
args = vars(ap.parse_args())

# load the image
image = cv2.imread(args["image"])

# keep an untouched copy of the image
image_clone = image.copy()

# loop until the 'c' key is pressed
while True:
    # display the image
    cv2.imshow("image", image)
    # wait for a keypress
    key = cv2.waitKey(1) & 0xFF
    if key == ord("c"):
        break

# close all open windows
cv2.destroyAllWindows()
The above Python console application takes in an argument from the console, which is the name of the image to display. Once the image name is obtained, you'll use OpenCV to display the image. At the same time, you want to clone the image so that you can use it later on. To stop the program, press the c key while the image window is in focus, or press Ctrl-C in Terminal.
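If you want to see what the argument parsing produces before involving OpenCV at all, you can try it in isolation. The following is a minimal sketch: the explicit list passed to parse_args() is my stand-in for what you would type on the command line, and Cabs.jpg is only used as a sample file name.

```python
import argparse

# Same parser definition as in bounding.py.
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="Path to image")

# Passing a list to parse_args() simulates "python bounding.py -i Cabs.jpg".
# vars() turns the resulting Namespace into a plain dictionary.
args = vars(ap.parse_args(["-i", "Cabs.jpg"]))

print(args)           # {'image': 'Cabs.jpg'}
print(args["image"])  # Cabs.jpg
```

This is why the rest of the program can read the file name as args["image"]: the dictionary key comes from the long option name, --image.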
To run the program, go to Terminal and type in the following command:
$ python bounding.py -i Cabs.jpg
The above Cabs.jpg file can be downloaded from https://en.wikipedia.org/wiki/Taxi#/media/File:Cabs.jpg.
The image should now be displayed:
We want the user to be able to click on the image using their mouse and then drag across the image to select a particular region of interest (ROI). For this, let's add two global variables to the program:
import argparse
import cv2

# to store the points for the region of interest
roi_pt = []

# to indicate if the left mouse button is depressed
is_button_down = False
The following figure shows how roi_pt will store the coordinates of the ROI:
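In code terms, roi_pt goes through the states sketched below. This is only an illustration of the list's contents, with no OpenCV involved; the coordinates are the ones from the sample run shown later in this article.

```python
# Before anything is selected, the list is empty.
roi_pt = []

# Left button pressed at (430, 409): start a new list with the first corner.
roi_pt = [(430, 409)]

# Left button released at (764, 656): append the opposite corner.
roi_pt.append((764, 656))

# The ROI is now described by two points: (x1, y1) and (x2, y2).
(x1, y1), (x2, y2) = roi_pt
print(roi_pt)  # [(430, 409), (764, 656)]
```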
You'll now define a function named draw_rectangle() to be the handler for mouse events. This function takes in five arguments: event, x, y, flags, and param. We'll only be using the first three arguments for this exercise:
def draw_rectangle(event, x, y, flags, param):
    # these module-level variables are reassigned inside the handler
    global roi_pt, is_button_down, image

    if event == cv2.EVENT_MOUSEMOVE and is_button_down:
        # get the original image to paint the new rectangle
        image = image_clone.copy()
        # draw new rectangle
        cv2.rectangle(image, roi_pt[0], (x, y), (0, 255, 0), 2)

    if event == cv2.EVENT_LBUTTONDOWN:
        # record the first point
        roi_pt = [(x, y)]
        is_button_down = True
    # if the left mouse button was released
    elif event == cv2.EVENT_LBUTTONUP:
        roi_pt.append((x, y))  # append the end point

        # ======================
        # print the bounding box
        # ======================
        # in (x1,y1,x2,y2) format
        print(roi_pt)

        # in (x,y,w,h) format
        bbox = (roi_pt[0][0],
                roi_pt[0][1],
                roi_pt[1][0] - roi_pt[0][0],
                roi_pt[1][1] - roi_pt[0][1])
        print(bbox)

        # button has now been released
        is_button_down = False

        # draw the bounding box
        cv2.rectangle(image, roi_pt[0], roi_pt[1], (0, 255, 0), 2)
        cv2.imshow("image", image)
In the above function:
- When the left mouse button is depressed (cv2.EVENT_LBUTTONDOWN), you record the first point of the ROI. You then set the is_button_down variable to True so that you can start drawing a rectangle when the user moves the mouse while holding down the left mouse button.
- When the user moves the mouse with the left mouse button depressed (cv2.EVENT_MOUSEMOVE and is_button_down), you draw a rectangle on a copy of the original image. You need to draw on a copy because, as the user moves the mouse, you also need to remove the rectangle that you drew previously. The easiest way to accomplish this is to discard the previous image and draw the new rectangle on a fresh copy of the original.
- When the user finally releases the left mouse button (cv2.EVENT_LBUTTONUP), you append the end point of the ROI to roi_pt. You then print out the bounding box coordinates. For some deep learning packages, the bounding box coordinates are in the format of (x, y, width, height), so I also computed the ROI coordinates in this format.
- Finally, draw the bounding box for the ROI.
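One thing to be aware of with the (x, y, width, height) format: the subtraction in the handler gives a negative width or height if the user drags from right to left or from bottom to top. A minimal sketch of one way to handle this is shown below. Note that to_xywh() is a hypothetical helper of my own naming, not something provided by OpenCV, and it assumes the two points are opposite corners of the rectangle.

```python
def to_xywh(pt1, pt2):
    """Convert two opposite corners into (x, y, width, height).

    Hypothetical helper: takes the smaller coordinate as the origin and the
    absolute difference as the size, so the drag direction does not matter.
    """
    (x1, y1), (x2, y2) = pt1, pt2
    return (min(x1, x2), min(y1, y2), abs(x2 - x1), abs(y2 - y1))


# Dragging from top-left to bottom-right.
print(to_xywh((430, 409), (764, 656)))  # (430, 409, 334, 247)

# The same box, dragged from bottom-right to top-left.
print(to_xywh((764, 656), (430, 409)))  # (430, 409, 334, 247)
```

Whether you need this depends on what your training package expects. Some tools are happy with two arbitrary corners, while others assume that (x, y) is always the top-left corner.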
To wire up the mouse events with their event handler, add in the following statements:
...
# keep an untouched copy of the image
image_clone = image.copy()

# ======ADD the following======
# setup the mouse click handler
cv2.namedWindow("image")
cv2.setMouseCallback("image", draw_rectangle)
# =============================

# loop until the 'c' key is pressed
while True:
    ...
Run the program one more time and you can now select the ROI from the image, and a rectangle will be displayed:
At the same time, the coordinates of the ROI will also be displayed:
[(430, 409), (764, 656)]
(430, 409, 334, 247)
For your convenience, here is the complete Python program:
import argparse
import cv2

# to store the points for the region of interest
roi_pt = []

# to indicate if the left mouse button is depressed
is_button_down = False

def draw_rectangle(event, x, y, flags, param):
    # these module-level variables are reassigned inside the handler
    global roi_pt, is_button_down, image

    if event == cv2.EVENT_MOUSEMOVE and is_button_down:
        # get the original image to paint the new rectangle
        image = image_clone.copy()
        # draw new rectangle
        cv2.rectangle(image, roi_pt[0], (x, y), (0, 255, 0), 2)

    if event == cv2.EVENT_LBUTTONDOWN:
        # record the first point
        roi_pt = [(x, y)]
        is_button_down = True
    # if the left mouse button was released
    elif event == cv2.EVENT_LBUTTONUP:
        roi_pt.append((x, y))  # append the end point

        # ======================
        # print the bounding box
        # ======================
        # in (x1,y1,x2,y2) format
        print(roi_pt)

        # in (x,y,w,h) format
        bbox = (roi_pt[0][0],
                roi_pt[0][1],
                roi_pt[1][0] - roi_pt[0][0],
                roi_pt[1][1] - roi_pt[0][1])
        print(bbox)

        # button has now been released
        is_button_down = False

        # draw the bounding box
        cv2.rectangle(image, roi_pt[0], roi_pt[1], (0, 255, 0), 2)
        cv2.imshow("image", image)

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True, help="Path to image")
args = vars(ap.parse_args())

# load the image
image = cv2.imread(args["image"])

# keep an untouched copy of the image
image_clone = image.copy()

# setup the mouse click handler
cv2.namedWindow("image")
cv2.setMouseCallback("image", draw_rectangle)

# loop until the 'c' key is pressed
while True:
    # display the image
    cv2.imshow("image", image)
    # wait for a keypress
    key = cv2.waitKey(1) & 0xFF
    if key == ord("c"):
        break

# close all open windows
cv2.destroyAllWindows()
In this short article, I demonstrated how you can annotate an image by selecting the object in an image. Of course, once the coordinates of the object have been mapped out, you need to store them in an external file (such as a JSON or CSV file). For this, I'll leave it as an exercise to the reader. Let me know if this is useful, or which annotation tools you use in your daily work.
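If you would like a starting point for that exercise, here is a minimal sketch that appends each bounding box to a CSV file using only the standard library. Everything specific in it is an assumption on my part rather than part of the program above: the append_annotation() name, the column layout, and the idea of passing in a label along with the box.

```python
import csv
import os


def append_annotation(csv_path, image_name, label, bbox):
    """Append one annotation row to csv_path (hypothetical helper).

    bbox is expected in (x, y, width, height) format. A header row is
    written first if the file does not exist yet or is empty.
    """
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["image", "label", "x", "y", "width", "height"])
        writer.writerow([image_name, label, *bbox])
```

You could call it from the cv2.EVENT_LBUTTONUP branch of the handler, for example append_annotation("annotations.csv", "Cabs.jpg", "taxi", bbox), where the file name and the "taxi" label are placeholders you would replace with your own.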