How to send data to BQ & GCS from On-prem via Hybrid connection
By default, when on-prem hosts send data to a Cloud Storage bucket or a BigQuery dataset, the traffic goes over the internet to the public API endpoints (https://bigquery.googleapis.com/ and https://storage.googleapis.com) and does not use your hybrid connectivity between GCP and on-prem.
To send data over the hybrid connection instead, there are two options: Private Google Access for on-prem hosts and Private Service Connect for on-prem hosts. Let's look at both.
Private Google Access provides two special domains: private.googleapis.com (199.36.153.8/30) and restricted.googleapis.com (199.36.153.4/30).
Use private.googleapis.com when your on-prem hosts need access to most Google APIs, and use restricted.googleapis.com when they should reach only the APIs supported by VPC Service Controls and no others. You can get the detailed information from the link below.
Configuring Private Google Access for on-premises hosts | VPC | Google Cloud
For Private Google Access, you need to configure the following:
- DNS: Create a Cloud DNS managed private zone and use a Cloud DNS inbound server policy, or configure your on-prem DNS server to resolve to the set of IP addresses for private.googleapis.com or restricted.googleapis.com.
- Routes: Create routes to private.googleapis.com (199.36.153.8/30) or restricted.googleapis.com (199.36.153.4/30) with the next hop set to the default internet gateway.
- Route advertisements: Advertise custom routes for 199.36.153.8/30 or 199.36.153.4/30 on the Cloud Routers that are used for hybrid connectivity.
- Firewall: Allow the necessary firewall rules for inbound/outbound connectivity from on-prem to GCP and vice versa.
This is fairly simple to configure compared to Private Service Connect.
So when should you use Private Service Connect instead of Private Google Access?
If you have a specific requirement to send data to Google APIs over the hybrid connection only for certain projects in your org and not for all projects, you can use Private Service Connect. This is especially useful when you have limited network bandwidth between GCP and on-prem: data that can go over the internet can still use the public API endpoint, while data that must travel over a private network can use the Private Service Connect endpoint.
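The Private Google Access checklist above could be sketched with gcloud roughly as follows. This is a minimal sketch, not a definitive setup: it assumes the private.googleapis.com range, and the names `my-vpc`, `my-cloud-router`, `us-central1`, `pga-zone`, `pga-route` and `onprem-inbound` are hypothetical placeholders that do not come from this post. Commands are only printed by default; set `DRY_RUN=0` to run them. Firewall rules are left out because they depend on your environment.

```shell
# Hypothetical placeholders: replace with your own VPC, Cloud Router and region.
NETWORK="${NETWORK:-my-vpc}"
ROUTER="${ROUTER:-my-cloud-router}"
REGION="${REGION:-us-central1}"
PGA_RANGE="199.36.153.8/30"   # private.googleapis.com

# Print each command by default; set DRY_RUN=0 to execute it for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

pga_setup() {
  # DNS: private zone for googleapis.com, an A record for the special
  # domain, and a wildcard CNAME pointing every API name at it.
  run gcloud dns managed-zones create pga-zone --description=pga-zone \
    --dns-name=googleapis.com. --visibility=private --networks="$NETWORK"
  run gcloud dns record-sets create private.googleapis.com. --zone=pga-zone \
    --type=A --ttl=300 \
    --rrdatas=199.36.153.8,199.36.153.9,199.36.153.10,199.36.153.11
  run gcloud dns record-sets create "*.googleapis.com." --zone=pga-zone \
    --type=CNAME --ttl=300 --rrdatas=private.googleapis.com.
  # Inbound server policy so on-prem resolvers can query Cloud DNS.
  run gcloud dns policies create onprem-inbound --description=onprem-inbound \
    --networks="$NETWORK" --enable-inbound-forwarding
  # Route: send the special range to the default internet gateway.
  run gcloud compute routes create pga-route --network="$NETWORK" \
    --destination-range="$PGA_RANGE" \
    --next-hop-gateway=default-internet-gateway
  # Route advertisement: announce the range to on-prem over BGP.
  run gcloud compute routers update "$ROUTER" --region="$REGION" \
    --advertisement-mode=custom --set-advertisement-groups=all_subnets \
    --set-advertisement-ranges="$PGA_RANGE"
}

pga_setup
```

For restricted.googleapis.com, the same sketch should apply with 199.36.153.4/30 and its four addresses substituted.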
Access Google APIS using Private Service Connect | VPC | Google Cloud
Steps to create a Private Service Connect endpoint:
- Reserve a static internal IPv4 address that is not within the range of any subnet configured in the VPC network. The requirements are given in the Google Cloud link above.
gcloud beta compute addresses create psc-ip \
- create a forwarding rule
gcloud beta compute forwarding-rules create pscendpoint \
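The two commands above are cut off in this post. As a hedged sketch of what complete invocations might look like, the following uses the GA `gcloud compute` commands rather than `beta`. The network name `my-vpc`, the address `10.255.255.254`, and the choice of the `all-apis` bundle are assumptions for illustration only and are not taken from the original. Commands are printed by default; set `DRY_RUN=0` to run them.

```shell
# Hypothetical placeholders: your VPC and an unused internal IPv4 address
# that lies outside every subnet range in that VPC.
NETWORK="${NETWORK:-my-vpc}"
PSC_IP="${PSC_IP:-10.255.255.254}"

# Print each command by default; set DRY_RUN=0 to execute it for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

psc_endpoint_setup() {
  # Reserve the global internal address for the endpoint.
  run gcloud compute addresses create psc-ip --global \
    --purpose=PRIVATE_SERVICE_CONNECT --addresses="$PSC_IP" \
    --network="$NETWORK"
  # Create the endpoint: a global forwarding rule targeting a Google APIs
  # bundle (all-apis is assumed here; vpc-sc is the other bundle).
  run gcloud compute forwarding-rules create pscendpoint --global \
    --network="$NETWORK" --address=psc-ip \
    --target-google-apis-bundle=all-apis
}

psc_endpoint_setup
```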
Create Cloud DNS managed private zone and use a Cloud DNS inbound server policy
- create a private DNS zone.
gcloud dns --project=<projectname> managed-zones create psc-dns-zone --description="" --dns-name="p.googleapis.com." --visibility="private" --networks=<yournetwork>
- create a DNS A record.
gcloud dns --project=<projectname> record-sets transaction start --zone=psc-dns-zone
gcloud dns --project=<projectname> record-sets transaction add <pscendpointip> --name=storage-pscendpoint.p.googleapis.com. --ttl=300 --type=A --zone=psc-dns-zone
gcloud dns --project=<projectname> record-sets transaction execute --zone=psc-dns-zone
- Advertise the <pscendpointip> under the custom routes section of the Cloud Routers used for the hybrid connection.
- Open the necessary firewall rules between GCP and on-prem.
- Create a boto configuration file on your on-prem host.
[Credentials]
gs_host = storage-pscendpoint.p.googleapis.com
gs_host_header = storage.googleapis.com
gs_json_host = storage-pscendpoint.p.googleapis.com
gs_json_host_header = www.googleapis.com
[GSUtil]
default_project_id = XXXXXXXXXXXXXX
default_api_version = 2
[Boto]
https_validate_certificates = False
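If you would rather try the endpoint on a single transfer before changing the boto file, gsutil's top-level `-o "Section:name=value"` option can override the same host settings per command. The sketch below assumes these keys live in the `[Credentials]` section of the boto configuration; the file `./data.csv` and the bucket `gs://my-bucket/` are hypothetical. The command is only printed unless `DRY_RUN=0` is set.

```shell
# Endpoint DNS name from the A record created earlier in this post.
ENDPOINT_HOST="storage-pscendpoint.p.googleapis.com"

# Print each command by default; set DRY_RUN=0 to execute it for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Wrap gsutil so any subcommand is sent to the Private Service Connect
# endpoint without editing the boto configuration file.
psc_gsutil() {
  run gsutil \
    -o "Credentials:gs_host=${ENDPOINT_HOST}" \
    -o "Credentials:gs_host_header=storage.googleapis.com" \
    -o "Credentials:gs_json_host=${ENDPOINT_HOST}" \
    -o "Credentials:gs_json_host_header=www.googleapis.com" \
    "$@"
}

# Hypothetical file and bucket names.
psc_gsutil cp ./data.csv gs://my-bucket/
```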
To verify that the data goes through the Private Service Connect endpoint, capture traffic with tcpdump and inspect the capture in Wireshark. If you see your Private Service Connect endpoint IP in the packet capture, you have successfully configured data transfer over the hybrid connection.
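A rough sketch of that check from the on-prem host: first confirm that the endpoint name resolves to the endpoint IP, then capture HTTPS traffic to that IP while a transfer runs. The address `10.255.255.254` and the output file `psc-check.pcap` are hypothetical, and `-i any` assumes a Linux host. Commands are printed by default; set `DRY_RUN=0` to run them.

```shell
# Hypothetical endpoint IP: use the address you reserved for the endpoint.
PSC_IP="${PSC_IP:-10.255.255.254}"
ENDPOINT_HOST="storage-pscendpoint.p.googleapis.com"

# Print each command by default; set DRY_RUN=0 to execute it for real.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

verify_psc_path() {
  # The name should resolve to the endpoint IP, not a public address.
  run dig +short "$ENDPOINT_HOST"
  # Capture TLS traffic to the endpoint into a file for Wireshark.
  # Stop with Ctrl-C once the transfer has finished.
  run sudo tcpdump -i any -nn -w psc-check.pcap host "$PSC_IP" and tcp port 443
}

verify_psc_path
```

Start the capture in one terminal, run the gsutil transfer in another, and then open `psc-check.pcap` in Wireshark.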
Thanks for reading this post, I hope it was useful.