In this post, we will use the Google Cloud Vision API to detect multiple objects in an image, mark each one with a pointer, and label it with its name.
To start, enable the Cloud Vision API in the Google Cloud console or with the gcloud command:
gcloud services enable vision.googleapis.com
Next, create a service account with the “AI Platform Admin” role.
Create a JSON key for this service account.
Now create a Jupyter notebook, either on your local machine or in Google Cloud AI Platform. (Please note that creating a Python notebook instance in GCP will incur costs.)
- Import the required libraries
- Resize and visualize the image
- Normally we would use cv2.rectangle to mark the identified objects; for a change, we will use pointers. Because we are detecting multiple objects, drawing multiple rectangles on one image would make it look cluttered, so it is better to mark each object with a pointer image. Read a 32×32 pointer icon image with an alpha channel and separate the RGB channels from the alpha channel. Merge the alpha channel into a black-background mask, then combine the original pointer image with this mask.
- Place the JSON key file in a hidden folder and set the “GOOGLE_APPLICATION_CREDENTIALS” environment variable so that it points to the file
- Export the JSON key file
- Define a function detect_objects that takes the content of the image file as its argument
- Initialize the vision ImageAnnotatorClient
- Use the object_localization feature in the ImageAnnotatorClient and pass…
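The credential and detection steps above can be sketched roughly as follows. This is a minimal sketch, not the notebook's original code: the key path `~/.gcp/vision-key.json` and the helper `pointer_anchor` are hypothetical names introduced for illustration, and placing the pointer at the center of each bounding polygon is an assumption about where the marker should go. The sketch also assumes a recent `google-cloud-vision` client (2.x or later), where `vision.Image` is available and the image is passed to `object_localization` as the `image` argument.

```python
import os

# Hypothetical key location (an assumption); point this at your own JSON key.
os.environ.setdefault(
    "GOOGLE_APPLICATION_CREDENTIALS",
    os.path.expanduser("~/.gcp/vision-key.json"),
)


def pointer_anchor(normalized_vertices, width, height):
    """Return the pixel (x, y) at the center of a normalized bounding polygon.

    The vertices are (x, y) pairs in the 0..1 range, as returned by object
    localization. Using the center as the pointer position is an assumption.
    """
    xs = [x for x, _ in normalized_vertices]
    ys = [y for _, y in normalized_vertices]
    center_x = sum(xs) / len(xs) * width
    center_y = sum(ys) / len(ys) * height
    return round(center_x), round(center_y)


def detect_objects(content):
    """Return a list of (name, score, normalized_vertices) for each object.

    `content` is the raw bytes of the image file.
    """
    # Imported here so the helper above can be used without the client library.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    image = vision.Image(content=content)
    response = client.object_localization(image=image)

    results = []
    for obj in response.localized_object_annotations:
        vertices = [(v.x, v.y) for v in obj.bounding_poly.normalized_vertices]
        results.append((obj.name, obj.score, vertices))
    return results
```

To use it, read the image file in binary mode, pass the bytes to `detect_objects`, and call `pointer_anchor(vertices, width, height)` with the size of the resized image to get the pixel position for each pointer and label.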