This guide explains how to connect a Google Cloud Storage (GCS) bucket to Labellerr, a SaaS product for data annotation and labeling.
The connection relies on IAM (Identity and Access Management), the Google Cloud service that lets you manage access to your Google Cloud resources securely. You will create a custom IAM role and a service account that holds only the permissions Labellerr needs, which grants Labellerr access to your GCS bucket while keeping your data secure.
Note: the procedures in this guide apply to connecting a GCS bucket and Labellerr within the same Google Cloud account.

Quick Reference

GCS URI Format

GCS Path Format

Format: gs://bucket-name/path/to/folder/
Example: gs://my-gcs-bucket/annotations/dataset-001/
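
The path format above can be checked before it is pasted into a connection form. The following is a minimal sketch that splits a gs:// path into its bucket name and folder prefix; the `GcsPath` type and `parse_gcs_uri` function are hypothetical helpers written for this illustration and are not part of the Labellerr SDK.

```python
from typing import NamedTuple


class GcsPath(NamedTuple):
    bucket: str
    prefix: str


def parse_gcs_uri(uri: str) -> GcsPath:
    """Split a gs:// URI into bucket name and folder prefix."""
    scheme = "gs://"
    if not uri.startswith(scheme):
        raise ValueError(f"expected a URI starting with {scheme!r}, got {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the prefix.
    bucket, _, prefix = uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name in {uri!r}")
    return GcsPath(bucket=bucket, prefix=prefix)


print(parse_gcs_uri("gs://my-gcs-bucket/annotations/dataset-001/"))
# GcsPath(bucket='my-gcs-bucket', prefix='annotations/dataset-001/')
```

A path that starts with s3:// or omits the bucket name raises a ValueError, which catches the most common copy-and-paste mistakes early.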

Required Permissions Summary

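Permissions are grouped by operation: Import (Read) and Export (Write). Import requires the following:

| Permission | Purpose |
| --- | --- |
| storage.objects.get | Read files from bucket |
| storage.objects.list | List files in bucket |
| storage.buckets.get | Get bucket metadata |
| storage.buckets.update | Update bucket settings (CORS) |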

S3 vs GCS Comparison

| Feature | AWS S3 | Google Cloud Storage |
| --- | --- | --- |
| URI Format | s3://bucket/path/ | gs://bucket/path/ |
| Authentication | Access Key + Secret Key | Service Account JSON |
| Import Permissions | 5 permissions | 4 permissions |
| Export Permissions | +2 permissions | +2 permissions |
| CORS Setup | Via IAM policy or manual | Via bucket settings |

Using AWS S3 instead? See our AWS S3 Connection Guide for S3 setup instructions.
To connect data with GCS in Labellerr:

Prerequisites

Step 1: Role Creation

Go to the IAM & Admin Console:
  • Open the Google Cloud Console
  • Navigate to IAM & Admin > Roles
Create a Custom Role:
  • Click on + CREATE ROLE
  • Enter a Title, ID, and Description for the role
  • Click CONTINUE
[Screenshot: Create custom role in GCS]
Add Permissions:
  • In the Permissions section, add the necessary permissions
For accessing a GCS bucket, you typically need:
  1. storage.buckets.get
  2. storage.buckets.update
  3. storage.objects.get
  4. storage.objects.list
  • Click CONTINUE
  • Review the role details
  • Click CREATE
[Screenshot: GCS role permissions]
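
As an alternative to clicking through the console, the same custom role can be described in a file. The sketch below builds a role definition containing the four permissions listed above and prints it as JSON. The role title and description are hypothetical placeholders, and the field layout (`title`, `description`, `stage`, `includedPermissions`) is assumed to match what `gcloud iam roles create` accepts through its `--file` flag.

```python
import json

# The four permissions listed in this step.
IMPORT_PERMISSIONS = [
    "storage.buckets.get",
    "storage.buckets.update",
    "storage.objects.get",
    "storage.objects.list",
]


def build_role_definition(title: str, description: str) -> dict:
    """Return a custom-role definition as a plain dictionary."""
    return {
        "title": title,
        "description": description,
        "stage": "GA",
        "includedPermissions": sorted(IMPORT_PERMISSIONS),
    }


role = build_role_definition(
    title="Labellerr GCS Import",  # hypothetical title
    description="Read access for Labellerr imports",  # hypothetical description
)
print(json.dumps(role, indent=2))
```

If the output is saved as role.json, a command along the lines of `gcloud iam roles create labellerr_gcs_import --project=YOUR_PROJECT_ID --file=role.json` should create the role; the role ID and project ID here are placeholders.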
Step 2: Service Account Creation

Go to the Service Accounts Console:
  • In the Google Cloud Console, navigate to IAM & Admin > Service Accounts
[Screenshot: GCS service accounts console]
Create a Service Account:
  • Click on + CREATE SERVICE ACCOUNT
  • Enter a Service account name and Description
  • Click CREATE AND CONTINUE
Grant Access to Project:
  • In the Grant this service account access to project section, add the role created in the previous step
  • Click CONTINUE
[Screenshot: Grant role to service account]
Grant Users Access to this Service Account (Optional):
  • If needed, add users who can manage this service account
  • Click DONE
Step 3: Generate and Download Service Account Key

Select Service Account:
  • In the Service Accounts console, find the service account you created
  • Click the ⋮ (three vertical dots) next to the service account and select Manage keys
[Screenshot: Manage service account keys]
Create Key:
  • Click ADD KEY > Create new key
[Screenshot: Create new key]
  • Select JSON as the key type
[Screenshot: Select key type]
[Screenshot: Download key file]
  • Click CREATE
  • A JSON key file will be downloaded. Keep this file secure as it contains the credentials needed to access the GCS bucket
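
Before uploading the key, it can help to confirm that the downloaded file is intact. The sketch below is a hypothetical helper, not part of the Labellerr SDK, that checks the key text for the fields Google-issued service account keys normally contain. The field list is an assumption based on the usual key layout and only covers the most important entries.

```python
import json

# Fields that Google-issued service account key files normally contain.
REQUIRED_FIELDS = ("type", "project_id", "private_key_id", "private_key", "client_email")


def check_service_account_key(raw_json: str) -> list:
    """Return a list of problems found in the key file text (empty if none)."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    # A key downloaded from the Service Accounts console has type "service_account".
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']!r}")
    return problems


# Deliberately incomplete sample, to show how problems are reported.
sample = json.dumps({"type": "service_account", "project_id": "my-project"})
print(check_service_account_key(sample))
```

To check a real key, read the downloaded file into a string and pass it to the function. An empty list means the basic structure looks right; it does not prove the key is still active in Google Cloud.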
Step 4: Assign IAM Policy to GCS Bucket

  1. Go to the Cloud Storage Console: Navigate to Cloud Storage > Buckets in the Google Cloud Console
  2. Select Your Bucket: Click on the name of your private bucket. You will see the bucket details, including its Name, Location, and other information
  3. Open Permissions: Click on the Permissions tab
  4. Add Member:
    • Click ADD
    • In the New members field, enter the service account email
    • Select the role you created earlier
    • Click SAVE
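
The console steps above can also be scripted. The sketch below assumes the `google-cloud-storage` Python package and credentials that are allowed to change the bucket's IAM policy; the role name, bucket name, and service account email are placeholders. The `add_bucket_binding` helper is hypothetical and works on plain dictionaries, so it can be tried without any Google Cloud access.

```python
def add_bucket_binding(bindings: list, role: str, service_account_email: str) -> list:
    """Add the service account to `role` in a list of IAM bindings, in place."""
    member = f"serviceAccount:{service_account_email}"
    for binding in bindings:
        if binding["role"] == role:
            # Role already bound: extend its member set instead of duplicating it.
            binding["members"] = set(binding["members"]) | {member}
            return bindings
    bindings.append({"role": role, "members": {member}})
    return bindings


def grant_on_bucket(bucket_name: str, role: str, service_account_email: str) -> None:
    """Apply the binding to a real bucket (requires google-cloud-storage)."""
    # Imported lazily so the helper above has no third-party dependencies.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    policy = bucket.get_iam_policy(requested_policy_version=3)
    add_bucket_binding(policy.bindings, role, service_account_email)
    bucket.set_iam_policy(policy)


# Dry run on an empty binding list; all names below are placeholders.
bindings = []
add_bucket_binding(
    bindings,
    role="projects/my-project/roles/labellerr_gcs_import",
    service_account_email="labellerr@my-project.iam.gserviceaccount.com",
)
print(bindings)
```

Custom roles are referenced as `projects/PROJECT_ID/roles/ROLE_ID`, and service accounts as `serviceAccount:EMAIL`. Calling `grant_on_bucket` with real values performs the same change as the ADD and SAVE steps in the console.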
Step 5: Use the Labellerr Tool to Access the GCS Data

Select ‘Google Cloud Storage’ from the sources list, then click ‘Create New Connection’. You can connect your data in one of two ways: ‘Private Bucket’ or ‘Public Bucket’.
[Screenshot: Labellerr GCS connector]
[Screenshot: GCS connection form]
After filling in the details, click Test Connection. If the details are correct, you should see a success message:
[Screenshot: GCS test connection success]
Otherwise, you will see an error:
[Screenshot: GCS test connection error]
Once the connection test succeeds and your files are recognized, fill in the Connection name and, optionally, a Description.
[Screenshot: GCS connection details]
Your GCS bucket is now connected to Labellerr.

Troubleshooting

Symptom: Connection test fails with “invalid service account” error.
Cause: The JSON key file is corrupted, expired, or incorrectly formatted.
Solution:
  1. Go to IAM & Admin > Service Accounts in Google Cloud Console
  2. Select your service account
  3. Click Manage keys
  4. Delete the old key and create a new one
  5. Download the new JSON key file and use it for the connection

Symptom: Connection test fails with “permission denied” or “403 Forbidden” error.
Cause: The service account doesn’t have the required role assigned to the bucket.
Solution:
  1. Go to Cloud Storage > Buckets in Google Cloud Console
  2. Select your bucket and click Permissions tab
  3. Click ADD and enter the service account email
  4. Assign the custom role you created with the required permissions
  5. Click SAVE

Symptom: Connection test fails with “bucket not found” error.
Cause: The bucket name in the path doesn’t match the actual bucket name.
Solution:
  1. Verify the gs:// path matches your exact bucket name
  2. Check for typos in the bucket name
  3. Ensure the bucket exists in the same Google Cloud project
  4. Path format should be: gs://exact-bucket-name/folder/

Symptom: Dataset appears to be created successfully but then shows “Failed” status.
Cause: The service account doesn’t have the required permissions to access the GCS bucket.
Solution:
  1. Verify all required permissions are in the custom role
  2. Ensure the role is assigned to the bucket (not just the project)
  3. Test the connection before creating datasets:
from labellerr.core.connectors import LabellerrConnection
from labellerr.core.schemas import ConnectionType, DatasetDataType

# `client` is assumed to be an already-authenticated Labellerr SDK client.
connection = LabellerrConnection(client=client, connection_id="your_connection_id")

# Check that Labellerr can read image files at the given GCS path.
test_result = connection.test(
    path="gs://your-bucket/path/to/data/",
    connection_type=ConnectionType._IMPORT,
    data_type=DatasetDataType.image
)
print(test_result)

Symptom: Export fails with “No files found with the given status” error.
Cause: Files have not been moved to the expected workflow stage.
Solution:
  1. Ensure files have been annotated AND reviewed before exporting
  2. Check the statuses parameter in your export configuration
  3. Valid statuses for export: review, client_review, accepted
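
The status check in the steps above can be sketched as a small guard that runs before an export is requested. This is a hypothetical helper, not part of the Labellerr SDK; it only encodes the three valid statuses listed above, and the status name `annotation` in the example is made up to show a rejected value.

```python
# Statuses listed above as valid for export.
VALID_EXPORT_STATUSES = {"review", "client_review", "accepted"}


def invalid_statuses(statuses):
    """Return the requested statuses that are not valid for export, sorted."""
    return sorted(set(statuses) - VALID_EXPORT_STATUSES)


# "annotation" is a made-up status used only to demonstrate a rejection.
print(invalid_statuses(["accepted", "annotation"]))
# ['annotation']
```

An empty result means every requested status is one of the three accepted values.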