In this post, we will see how to upload data to Google Cloud Firestore using the asyncio and aiofiles packages, which speed up the upload by running the code concurrently.
GCP Firestore AsyncClient code snippets are available at the link below.
python-docs-samples/firestore/cloud-async-client at main · GoogleCloudPlatform/python-docs-samples
There is also a nice blog post from Twilio on how to use the asyncio and aiofiles packages in Python to work with files asynchronously.
Working with Files Asynchronously in Python using aiofiles and asyncio
For testing, I have chosen the newline-delimited JSON dataset below from Kaggle.
Kaggle: Your Home for Data Science
Enable the Firestore API and provision Firestore in Native mode in the region of your choice.
gcloud app create
gcloud firestore databases create --region us-central
In the code snippet below, we download the data from a GCS bucket and upload the downloaded newline-delimited JSON file into Firestore.
Firestore's write and transaction limits are given below.
Since I encountered "google.api_core.exceptions.DeadlineExceeded: 504 Deadline Exceeded" exceptions during the upload, I added a sleep interval of 1 second after every 500 records.
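The throttling workaround can be sketched as follows: pause after every 500 document writes so the async writes don't pile up and trip the RPC deadline. The write coroutine is injected here so the pattern can run without Firestore credentials; with the real client it would be something along the lines of `await db.collection("items").document(doc_id).set(record)` on a `google.cloud.firestore.AsyncClient` (collection and document names are placeholders).

```python
import asyncio


async def upload_with_throttle(records, write_one, batch_size=500, pause_s=1.0):
    """Write records sequentially, sleeping pause_s after every batch_size writes."""
    written = 0
    for record in records:
        await write_one(record)
        written += 1
        if written % batch_size == 0:
            await asyncio.sleep(pause_s)
    return written


# Demo with a stand-in writer (no Firestore connection needed).
async def demo():
    sink = []

    async def fake_write(rec):
        sink.append(rec)

    total = await upload_with_throttle(
        (i for i in range(1200)), fake_write, batch_size=500, pause_s=0.01
    )
    return total, len(sink)


total, stored = asyncio.run(demo())
```

The 500-records/1-second values match what worked in this post; they are a starting point, not a Firestore-documented threshold, so tune them for your own dataset.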
After introducing the sleep interval, I no longer encountered the RPC deadline exceeded error.
I was able to upload ~90K records in 5 minutes using the asyncio package. (If anyone can tweak it further for better performance, let me know.)
Thanks for reading this post, I hope it was useful to you.