Google Cloud Storage Destination¶
Configure Google Cloud Storage as a destination for your data transfers in Pontoon.
Prerequisites¶
Before configuring Google Cloud Storage as a destination, ensure you have:
- GCP Account
- GCS Bucket: the GCS bucket that data will be delivered to
- Service Account: GCP service account credentials with permission to read and write to your bucket (see the verification sketch below)
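You can optionally verify that the credentials have the required access before configuring Pontoon. The following is a minimal sketch using the google-cloud-storage Python package; the bucket name and key file path are illustrative placeholders:

```python
from google.cloud import storage

# Illustrative values -- replace with your bucket and key file path
BUCKET_NAME = "my-bucket"
KEY_FILE = "service-account.json"

client = storage.Client.from_service_account_json(KEY_FILE)
bucket = client.bucket(BUCKET_NAME)

# Write a test object, read it back, then clean up
blob = bucket.blob("pontoon-access-check.txt")
blob.upload_from_string("ok")
assert blob.download_as_text() == "ok"
blob.delete()
print("Service account can read and write the bucket")
```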
How it works¶
The GCS connector writes data to your bucket as compressed Apache Parquet files organized in Apache Hive style partitions, a layout supported by most query engines and data platforms, which makes the landed data easy to work with.
This connector is append-only, so re-running syncs will produce new files with a later timestamp and different batch ID (see below) but will not delete existing data in the destination.
Structure of landed data¶
Data written to GCS will have the following structure:
`gs://<bucket_name>/<prefix>/<model>/dt=<transfer_date>/<transfer_timestamp>_<batch_id>_<file_index>.parquet`
- `<bucket_name>` is the name of your GCS bucket
- `<prefix>` is an optional folder prefix
- `<model>` is the name of the data model transferred, similar to a table name
- `<transfer_date>` is the date the transfer started, in the format 2025-01-01
- `<transfer_timestamp>` is a timestamp of when the transfer started, in the format 20250301121507
- `<batch_id>` is a batch ID generated by Pontoon that is unique to the running transfer; subsequent transfers to the same destination will have different batch IDs
- `<file_index>` is a monotonically increasing integer for a given batch ID
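Because the layout uses Hive-style partitions, most engines can read the landed data directly. As a minimal sketch in Python, assuming the pyarrow package with GCS filesystem support and illustrative bucket, prefix, and model names:

```python
import pyarrow as pa
import pyarrow.dataset as ds

# Illustrative location: bucket "my-bucket", prefix "exports", model "orders"
dataset = ds.dataset(
    "gs://my-bucket/exports/orders/",
    format="parquet",
    # Declare the Hive-style partition column so dt is read as a string
    partitioning=ds.partitioning(pa.schema([("dt", pa.string())]), flavor="hive"),
)

# Read only the files for a single transfer date
table = dataset.to_table(filter=ds.field("dt") == "2025-01-01")
print(table.num_rows)
```

Because syncs are append-only, a single `dt` partition can accumulate files from multiple runs; readers that only want the most recent sync typically deduplicate using the `<transfer_timestamp>` and `<batch_id>` embedded in the file names.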
Configuration¶
| Parameter | Description | Required | Example |
|---|---|---|---|
| `gcs_bucket_name` | GCS bucket name | Yes | `my-bucket` |
| `gcs_bucket_path` | Folder path prefix | Optional | `/exports` |
| `service_account` | Service account credentials | Yes | JSON key file |
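Putting the parameters together, a filled-in configuration might look like this hypothetical sketch (parameter names come from the table above; values are illustrative):

```python
# Illustrative destination settings for the GCS connector
gcs_destination = {
    "gcs_bucket_name": "my-bucket",    # required: bucket to write to
    "gcs_bucket_path": "/exports",     # optional: folder prefix inside the bucket
    "service_account": "service-account.json",  # required: service account JSON key
}
```

With these values, a model named users transferred on 2025-03-01 would land under gs://my-bucket/exports/users/dt=2025-03-01/.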
Setup Destination¶
- Navigate to Destinations → New Destination
- Select Google Cloud Storage as the destination type
- Enter the connection details from the Configuration table above
- Click Test Connection to verify
- Click Save to create the destination