Google Cloud Storage Destination

Configure Google Cloud Storage as a destination for your data transfers in Pontoon.

Prerequisites

Before configuring Google Cloud Storage as a destination, ensure you have:

  • GCP Account
  • GCS Bucket: GCS bucket that data will be sent to
  • Service Account: GCP Service Account credentials with permission to read and write to your bucket

How it works

The GCS connector writes data to your bucket as compressed Apache Parquet files using Apache Hive-style partitions, a layout supported by most query engines and data platforms, which makes the landed data easy to work with.

This connector is append-only, so re-running syncs will produce new files with a later timestamp and different batch ID (see below) but will not delete existing data in the destination.
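The append-only behavior can be illustrated with a toy simulation (this is illustrative only, not Pontoon's code): each sync writes new objects under a fresh batch ID and never deletes objects from earlier syncs.

```python
# A dict standing in for a GCS bucket: object key -> object bytes.
bucket = {}

def sync(bucket, model, timestamp, batch_id, files=2):
    """Simulate one sync: append `files` new objects, delete nothing."""
    date_part = f"{timestamp[:4]}-{timestamp[4:6]}-{timestamp[6:8]}"
    for i in range(files):
        key = f"{model}/dt={date_part}/{timestamp}_{batch_id}_{i}.parquet"
        bucket[key] = b"...parquet bytes..."

sync(bucket, "orders", "20250301121507", "batch-a")
sync(bucket, "orders", "20250302093000", "batch-b")  # re-run: adds files, removes none
assert len(bucket) == 4  # both batches' files coexist
```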

Structure of landed data

Data written to GCS will have the following structure:

gs://<bucket_name>/<prefix>/<model>/dt=<transfer_date>/<transfer_timestamp>_<batch_id>_<file_index>.parquet

  • <bucket_name> is the name of your GCS bucket
  • <prefix> is an optional folder prefix
  • <model> is the name of the data model transferred, similar to a table name
  • <transfer_date> is the date that the transfer started in the format 2025-01-01
  • <transfer_timestamp> is a timestamp of when the transfer started in the format 20250301121507
  • <batch_id> is a batch ID generated by Pontoon that is unique to the running transfer - subsequent transfers to the same destination will have different batch IDs
  • <file_index> is a monotonically increasing integer for a given batch ID
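The components above can be assembled into a full object path. The sketch below shows one way to do it (the function name is illustrative, not part of Pontoon):

```python
from datetime import datetime

def landed_object_path(bucket, prefix, model, started_at, batch_id, file_index):
    """Build the GCS path for one landed Parquet file, Hive-partitioned by date."""
    date_part = started_at.strftime("%Y-%m-%d")    # <transfer_date>
    ts_part = started_at.strftime("%Y%m%d%H%M%S")  # <transfer_timestamp>
    key = f"{model}/dt={date_part}/{ts_part}_{batch_id}_{file_index}.parquet"
    if prefix:
        key = f"{prefix.strip('/')}/{key}"
    return f"gs://{bucket}/{key}"

path = landed_object_path("my-bucket", "/exports", "orders",
                          datetime(2025, 3, 1, 12, 15, 7), "abc123", 0)
# → gs://my-bucket/exports/orders/dt=2025-03-01/20250301121507_abc123_0.parquet
```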

Configuration

| Parameter        | Description                 | Required | Example   |
| ---------------- | --------------------------- | -------- | --------- |
| gcs_bucket_name  | GCS bucket name             | Yes      | my-bucket |
| gcs_bucket_path  | Folder path prefix          | Optional | /exports  |
| service_account  | Service account credentials | Yes      | JSON file |
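Put together, a destination configuration using these parameters might look like the sketch below. The exact payload shape is illustrative; only the field names come from the table, and the service account values are placeholders.

```python
import json

config = {
    "gcs_bucket_name": "my-bucket",   # required
    "gcs_bucket_path": "/exports",    # optional folder prefix
    "service_account": json.loads("""{
        "type": "service_account",
        "project_id": "my-project",
        "client_email": "pontoon@my-project.iam.gserviceaccount.com"
    }"""),                            # contents of the service account JSON key file
}

# Required parameters per the table above.
required = {"gcs_bucket_name", "service_account"}
missing = required - config.keys()
assert not missing, f"missing required parameters: {missing}"
```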

Setup Destination

  1. Navigate to Destinations → New Destination
  2. Select Google Cloud Storage as the destination type
  3. Enter the connection details
  4. Click Test Connection to verify
  5. Click Save to create the destination
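Before pasting the service account credentials in step 3, it can help to confirm offline that the downloaded key file looks complete. This checker is a sketch (not part of Pontoon) that verifies the standard fields Google includes in a service account key JSON:

```python
import json

# Standard fields present in a Google service account key file.
KEY_FIELDS = {"type", "project_id", "private_key", "client_email"}

def check_key(raw: str) -> set:
    """Return the set of expected fields missing from the key JSON."""
    data = json.loads(raw)
    missing = KEY_FIELDS - data.keys()
    if data.get("type") != "service_account":
        missing.add("type=service_account")
    return missing

sample = ('{"type": "service_account", "project_id": "p", '
          '"private_key": "k", "client_email": "e"}')
assert check_key(sample) == set()  # nothing missing
```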