Multi-object detection using the Cloud Vision API

Rajathithan Rajasekar
Dec 8, 2021
Photo by Eric Prouzet on Unsplash

In this post, we will see how to use the Google Cloud Vision API to detect multiple objects in an image, mark them with pointers, and label each object by name.

To start, enable the Cloud Vision API in the Google Cloud console or with the gcloud command (gcloud services enable vision.googleapis.com).

gcloud services enable

cloud vision api

Next, create a service account with “AI Platform Admin” permissions.

service account

Create a JSON key for this service account.
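Google client libraries usually pick up a service account key through the GOOGLE_APPLICATION_CREDENTIALS environment variable. The snippet below is a minimal sketch of setting it from Python for the current process only. The helper name use_service_account_key and the path .secrets/vision-key.json are hypothetical and not taken from the original notebook.

```python
import os
from pathlib import Path


def use_service_account_key(key_path):
    """Point Google client libraries at a service account JSON key.

    Sets GOOGLE_APPLICATION_CREDENTIALS for the current process only and
    returns the absolute path that was stored.
    """
    resolved = str(Path(key_path).resolve())
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = resolved
    return resolved


# Hypothetical location: replace with wherever the downloaded key lives.
stored = use_service_account_key(".secrets/vision-key.json")
```

Keeping the key in a hidden folder that is excluded from version control reduces the chance of committing it by accident.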

Now you can create a Jupyter notebook on your local machine, or you can create a notebook in Google Cloud AI Platform. (Please note that creating a Python notebook instance in GCP will incur costs.)

  • Import the required libraries
  • Resize and visualize the image
  • Normally we would use cv2.rectangle to mark the identified objects, but for a change we will use pointers. Since we are detecting multiple objects, drawing multiple rectangles on one image would make it look cluttered, so it is much better to mark each object with a pointer image. Read a 32x32 pointer icon image with an alpha channel, and separate the RGB channels from the alpha channel. Merge the alpha channels to create a black background mask, and add the original pointer image to this mask.
  • Place the JSON key file in a hidden folder and export the “GOOGLE_APPLICATION_CREDENTIALS” environment variable
  • Export the path to the JSON key file
  • Define a function detect_objects that takes the content of the image file as its argument
  1. Initialize the vision ImageAnnotatorClient
  2. Use the object_localization feature in the ImageAnnotatorClient and pass…
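Since the steps above are cut short, here is a rough sketch of how the detection step might look with the google-cloud-vision Python client. The detect_objects body follows the two numbered steps. The pointer_position helper and the idea of placing the pointer at the center of each bounding box are assumptions for illustration, not the author's code. The file name street.jpg and the 640x480 size in the usage comment are hypothetical.

```python
def detect_objects(content):
    """Return the objects the Vision API localizes in raw image bytes."""
    # Imported lazily so the helper below can be used without the library.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    image = vision.Image(content=content)
    response = client.object_localization(image=image)
    return response.localized_object_annotations


def pointer_position(normalized_vertices, width, height):
    """Convert normalized (x, y) vertices to the pixel center of the box.

    The API reports bounding polygon vertices in the range 0..1, so they
    are scaled by the image width and height here. Placing the pointer at
    the box center is an assumption made for this sketch.
    """
    xs = [x for x, _ in normalized_vertices]
    ys = [y for _, y in normalized_vertices]
    cx = (min(xs) + max(xs)) / 2 * width
    cy = (min(ys) + max(ys)) / 2 * height
    return int(round(cx)), int(round(cy))


# Example use inside the notebook (needs credentials and network access):
# with open("street.jpg", "rb") as f:
#     for obj in detect_objects(f.read()):
#         pts = [(v.x, v.y) for v in obj.bounding_poly.normalized_vertices]
#         print(obj.name, round(obj.score, 2), pointer_position(pts, 640, 480))
```

Each returned annotation carries a name, a confidence score, and a bounding polygon, which is enough to decide where to draw the pointer icon and what label to write next to it.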


