How to Connect to Private Cloud SQL from Dataproc

Rajathithan Rajasekar
2 min read · Oct 24, 2023

If you need to connect to a private Cloud SQL instance from Dataproc for your Hadoop or Spark batch jobs, this post is for you.

When we provision a Dataproc cluster, we can specify one or more initialization actions.

Initialization actions are executed on each node in series during cluster creation. They are also executed on each newly added node when we scale up the cluster.

To connect the cluster to Cloud SQL through an initialization action, follow these steps:

  • Create a custom service account.
  • Grant the roles/cloudsql.admin IAM role to the custom service account.
  • Copy the cloud-sql-proxy-dataproc.sh script to a GCS bucket in your project.
  • Specify the script's GCS location in the cluster's initialization actions.
  • Add https://www.googleapis.com/auth/sqlservice.admin to the cluster VM access scopes.
  • Add further scopes as required, for example:
    https://www.googleapis.com/auth/cloud-platform
    https://www.googleapis.com/auth/bigquery…
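
The steps above can be sketched as a small Python helper that assembles the matching gcloud commands. This is only a rough sketch, not the exact commands from this post: the project, region, cluster, service account, and bucket names are hypothetical placeholders, and the script is assumed to sit in your current directory. The helper only prints the commands, so you can review them before running anything.

```python
import shlex

# Scope named in the steps above; further scopes can be appended as needed.
SQL_ADMIN_SCOPE = "https://www.googleapis.com/auth/sqlservice.admin"


def build_setup_commands(project, region, cluster, sa_name, bucket,
                         script="cloud-sql-proxy-dataproc.sh"):
    """Return the gcloud commands for the steps above as argument lists.

    All argument values are placeholders supplied by the caller.
    """
    sa_email = f"{sa_name}@{project}.iam.gserviceaccount.com"
    script_uri = f"gs://{bucket}/{script}"
    return [
        # 1. Create the custom service account.
        ["gcloud", "iam", "service-accounts", "create", sa_name,
         f"--project={project}"],
        # 2. Grant it the Cloud SQL role mentioned in the post.
        ["gcloud", "projects", "add-iam-policy-binding", project,
         f"--member=serviceAccount:{sa_email}",
         "--role=roles/cloudsql.admin"],
        # 3. Copy the initialization script to the GCS bucket.
        ["gcloud", "storage", "cp", script, script_uri],
        # 4. Create the cluster with the service account, scope,
        #    and initialization action.
        ["gcloud", "dataproc", "clusters", "create", cluster,
         f"--project={project}",
         f"--region={region}",
         f"--service-account={sa_email}",
         f"--scopes={SQL_ADMIN_SCOPE}",
         f"--initialization-actions={script_uri}"],
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for cmd in build_setup_commands("my-project", "us-central1",
                                    "my-cluster", "dataproc-sql",
                                    "my-bucket"):
        print(shlex.join(cmd))
```

Keeping each command as a list of arguments makes it easy to pass them to subprocess.run later if you want to automate the whole setup.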
