Text detection (OCR) and Language identification using Google Cloud Vision API

Rajathithan Rajasekar
3 min read · Oct 21, 2021

In this post, we will see how to use the Google Cloud Vision API to detect text in an image (optical character recognition, OCR) and identify the language of that text.

To start, enable the Cloud Vision API in the Google Cloud console or with the gcloud command:

gcloud services enable vision.googleapis.com

Next, create a service account with the “AI Platform Admin” role.

Create a JSON key for this service account.
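Before wiring the key into a notebook, it can help to confirm that the downloaded file really is a service account key. The snippet below is a minimal sketch of such a check, not part of the original walkthrough: the function name `check_service_account_key` is hypothetical, and the check only looks at a few of the fields that Google service account key files contain.

```python
import json

# A few of the fields present in a Google service account key file.
REQUIRED_FIELDS = {"type", "project_id", "private_key", "client_email"}


def check_service_account_key(path):
    """Return the service account email if the file looks like a valid key.

    Hypothetical helper for illustration; it only inspects the JSON
    structure and does not contact Google Cloud.
    """
    with open(path, "r", encoding="utf-8") as f:
        key = json.load(f)

    missing = REQUIRED_FIELDS - key.keys()
    if missing:
        raise ValueError(f"key file is missing fields: {sorted(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    return key["client_email"]
```

Calling `check_service_account_key` with the path to the downloaded key should return the service account's email address, or raise `ValueError` if the file is some other kind of JSON.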

Now you can create a Jupyter notebook on your local machine, or you can create a notebook in Google Cloud AI Platform. (Please note that creating a Python notebook instance in GCP will incur costs.)

  • Pip install the langcodes package. langcodes implements BCP 47, the IETF Best Current Practice on tags for identifying languages.
  • Import the required libraries.
  • Visualize the images that contain text.
  • Place the JSON key file in a hidden folder and export the “GOOGLE_APPLICATION_CREDENTIALS” environment variable so that it points to that file.
  • Create the function detect_text (takes the content of the image file as its argument):
      1. Initialize the vision ImageAnnotatorClient.
      2. Call the client's text_detection feature with the contents of the source image and return the response.
  • Create the function plot_text_annotations (takes the path of the image file as its argument):
      1. Read the contents of the image file (make sure you do a binary read).
      2. Pass the content to the detect_text function.
      3. Get the text_annotations from the response.
      4. Read the image file using cv2.imread.
      5. Convert the image from BGR to RGB channel order (cv2.COLOR_BGR2RGB) for plotting, since cv2.imread returns BGR.
      6. Print the identified text and language.
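
The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the original notebook code: it assumes the google-cloud-vision (2.x or later), opencv-python, matplotlib, and langcodes packages are installed, the key path `.keys/vision-key.json` and the helper `summarize` are hypothetical, and showing the image with `plt.imshow` is a guess at what the plotting step does. Third-party imports are placed inside the functions so the module loads even where those packages are missing.

```python
import os

# Assumption: the key sits in a hidden folder next to the notebook.
# The folder and file name are hypothetical placeholders.
os.environ.setdefault("GOOGLE_APPLICATION_CREDENTIALS", ".keys/vision-key.json")


def detect_text(content):
    """Send raw image bytes to the Vision API and return the response."""
    from google.cloud import vision  # pip install google-cloud-vision

    client = vision.ImageAnnotatorClient()
    image = vision.Image(content=content)
    return client.text_detection(image=image)


def summarize(text_annotations):
    """Return (locale, full_text) taken from the first annotation.

    For text_detection, the first annotation covers the whole detected
    text and carries the detected locale; later ones are single words.
    """
    if len(text_annotations) == 0:
        return None, ""
    first = text_annotations[0]
    return first.locale, first.description


def plot_text_annotations(image_path):
    """Run OCR on an image file, show the image, and print text and language."""
    import cv2  # pip install opencv-python
    import langcodes  # display_name() may need the optional language_data package
    import matplotlib.pyplot as plt

    # Binary read: the API expects the raw bytes of the file.
    with open(image_path, "rb") as f:
        content = f.read()

    response = detect_text(content)
    if response.error.message:
        raise RuntimeError(response.error.message)
    locale, text = summarize(response.text_annotations)

    # cv2.imread returns BGR; matplotlib expects RGB.
    img = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    plt.imshow(img)
    plt.axis("off")
    plt.show()

    print("Text:", text)
    if locale:
        print("Language:", langcodes.Language.get(locale).display_name())
```

In a notebook cell, calling `plot_text_annotations` with the path to one of your own image files should display the image and print the detected text together with the language name derived from the returned locale code.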