How to connect to private Cloud SQL from Dataproc

Rajathithan Rajasekar
2 min read · Oct 24, 2023
Photo by Daniel Born on Unsplash

If you need to connect to a private Cloud SQL instance from Dataproc for your Hadoop or Spark batch jobs, then this post is for you.

When we provision a Dataproc cluster, there is an option to specify initialization actions, which are scripts stored in Cloud Storage.

Initialization actions are executed in series on each node during cluster creation. They are also executed on each newly added node when we scale the cluster up.
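As a rough illustration of what such a script can look like, here is a minimal sketch of an initialization action that branches on the node's role. It assumes the `dataproc-role` metadata attribute and the `get_metadata_value` helper found on Dataproc images. The function name `setup_for_role` and the messages it prints are hypothetical and are not taken from the Cloud SQL proxy script.

```shell
#!/usr/bin/env bash
# Sketch of an initialization action; this is NOT the Cloud SQL proxy script.

# setup_for_role ROLE: hypothetical helper that decides what a node should do.
setup_for_role() {
  if [ "$1" = "Master" ]; then
    echo "master: run master-only setup"
  else
    echo "worker: run worker setup"
  fi
}

# On a real Dataproc node the role is read from instance metadata.
# Outside Dataproc the helper is missing, so the lookup is skipped.
METADATA_HELPER=/usr/share/google/get_metadata_value
if [ -x "$METADATA_HELPER" ]; then
  ROLE="$("$METADATA_HELPER" attributes/dataproc-role)"
  setup_for_role "$ROLE"
fi
```

Because the same script runs on every node, a branch like this is one way to keep master-only work off the workers.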

We can use this mechanism to run the Cloud SQL proxy script on the cluster nodes. To set this up, follow the steps below:

  • Create a custom service account
  • Add roles/cloudsql.admin IAM role to the custom service account
  • Copy the cloud-sql-proxy-dataproc.sh script to your project’s GCS bucket
  • Include the script's GCS location in the cluster's initialization actions
  • Add https://www.googleapis.com/auth/sqlservice.admin to the cluster VM access scopes
  • Add any of the additional scopes below as required:
    https://www.googleapis.com/auth/cloud-platform
    https://www.googleapis.com/auth/bigquery
    https://www.googleapis.com/auth/bigtable.admin.table
    https://www.googleapis.com/auth/bigtable.data
    https://www.googleapis.com/auth/cloud.useraccounts.readonly
    https://www.googleapis.com/auth/devstorage.full_control
    https://www.googleapis.com/auth/devstorage.read_write
    https://www.googleapis.com/auth/logging.write
  • Provision the Dataproc cluster with the custom service account:
gcloud dataproc clusters create clustername \
--image-version XXXXXXX \
--bucket XXXXXXX \
--region XXXXX \
--no-address \
--zone XXXXX \
--master-machine-type XXXXX \
--master-boot-disk-size 500 \
--master-boot-disk-type pd-standard \
--num-masters 1 \
--num-workers 3 \
--worker-machine-type XXXX \
--worker-boot-disk-size 500 \
--worker-boot-disk-type pd-standard \
--shielded-integrity-monitoring \
--shielded-secure-boot \
--shielded-vtpm \
--initialization-actions gs://xxxx/xxx.sh \
--optional-components XXXXX \
--scopes 'https://www.googleapis.com/auth/XXXX',XXXXXX,XXXXXX \…
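
The preparatory steps in the list above (service account, IAM role, script upload) can be sketched with gcloud and gsutil. This is a dry-run sketch under stated assumptions: it only prints the commands so they can be reviewed before being run. The project ID `my-project`, the service account name `dataproc-sql-sa`, the bucket `my-bucket`, and the helper names `print_setup_commands` and `join_scopes` are hypothetical placeholders, not values from this post.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands instead of executing them.
# The three values below are hypothetical placeholders.
PROJECT_ID="my-project"
SA_NAME="dataproc-sql-sa"
BUCKET="my-bucket"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

print_setup_commands() {
  # 1. Create the custom service account.
  echo "gcloud iam service-accounts create ${SA_NAME} --project ${PROJECT_ID}"
  # 2. Grant it the Cloud SQL role named in the steps above.
  echo "gcloud projects add-iam-policy-binding ${PROJECT_ID} --member serviceAccount:${SA_EMAIL} --role roles/cloudsql.admin"
  # 3. Copy the initialization script to the bucket.
  echo "gsutil cp cloud-sql-proxy-dataproc.sh gs://${BUCKET}/"
}

# join_scopes SCOPE...: builds the comma-separated value expected by --scopes.
join_scopes() {
  joined=""
  for scope in "$@"; do
    joined="${joined:+${joined},}${scope}"
  done
  echo "$joined"
}

print_setup_commands
```

The service account created this way would then be passed to the cluster through the `--service-account` flag of `gcloud dataproc clusters create`, and the output of `join_scopes` could be used as the value of `--scopes`.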
