Text detection (OCR) and Language identification using Google Cloud Vision API
--
In this post, we will see how to use the Google Cloud Vision API to detect text in an image (optical character recognition, or OCR) and identify the language of that text.
To start, enable the Cloud Vision API in the Google Cloud console or with the gcloud command:
gcloud services enable vision.googleapis.com
Next, create a service account with the “AI Platform Admin” role.
Create a JSON key for this service account and download it.
Now you can create a Jupyter notebook on your local machine, or create a notebook in Google Cloud AI Platform. (Please note that creating a Python notebook instance in GCP will incur costs.)
- Install the langcodes package with pip. langcodes implements BCP 47, the IETF Best Current Practice for language identification tags.
- Import the required libraries.
- Visualize the images that contain text.
- Place the JSON key file in a hidden folder and export its path in the “GOOGLE_APPLICATION_CREDENTIALS” environment variable.
- Create a function detect_text that takes the content of the image file as its argument.
- Inside detect_text, initialize the Vision ImageAnnotatorClient.
- Call the text_detection feature of the ImageAnnotatorClient with the content of the source image, and return the response.
- Create a function plot_text_annotations that takes the path of the image file as its argument.
- Read the content of the image file (make sure you open it in binary mode).
- Pass the content to the detect_text function.
- Get the text_annotations from the response.
- Read the image file using cv2.imread.
- Convert the image from BGR to RGB channel order (cv2.COLOR_BGR2RGB) for plotting, since OpenCV loads images as BGR.
- Print the identified text and language.
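The steps above can be sketched roughly as follows. This is a minimal sketch rather than the exact notebook code. It assumes the google-cloud-vision (2.x or later), opencv-python, matplotlib and langcodes packages are installed. The key path ".keys/vision-key.json" and the helpers summarize_annotations and language_name are hypothetical names introduced here for illustration. Third-party imports are deferred into the functions that need them, so the small parsing helper can be tried without cloud access.

```python
import os
from typing import Any, Optional, Sequence, Tuple

# Hypothetical key location (an assumption); point this at your own JSON key.
os.environ.setdefault("GOOGLE_APPLICATION_CREDENTIALS", ".keys/vision-key.json")


def detect_text(content: bytes) -> Any:
    """Send raw image bytes to the Vision API and return the response."""
    from google.cloud import vision  # pip install google-cloud-vision

    client = vision.ImageAnnotatorClient()
    image = vision.Image(content=content)
    response = client.text_detection(image=image)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response


def summarize_annotations(annotations: Sequence[Any]) -> Tuple[str, Optional[str]]:
    """Return (full_text, locale) from a text_annotations sequence.

    The first annotation holds the whole detected text and its locale;
    the remaining entries are individual words.
    """
    if not annotations:
        return "", None
    first = annotations[0]
    return first.description, (first.locale or None)


def language_name(locale: str) -> str:
    """Turn a BCP 47 tag such as 'en' into a readable language name."""
    import langcodes  # recent versions may also need the language_data package

    return langcodes.Language.get(locale).display_name()


def plot_text_annotations(image_path: str) -> None:
    """Show the image and print the text and language found in it."""
    import cv2
    import matplotlib.pyplot as plt

    with open(image_path, "rb") as image_file:  # binary read
        content = image_file.read()

    response = detect_text(content)
    text, locale = summarize_annotations(response.text_annotations)

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # OpenCV loads images as BGR; matplotlib expects RGB.
    plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    plt.axis("off")
    plt.show()

    print("Text:", text)
    print("Language:", language_name(locale) if locale else "not identified")
```

In a notebook cell you would then call plot_text_annotations with the path of one of your images. Keeping the response parsing in its own small function makes it easy to check that logic with stand-in objects before spending any API calls.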