```python
from labellerr.client import LabellerrClient
from labellerr.core.projects import list_projects
from labellerr.core.exceptions import LabellerrError

# Initialize the client with your API credentials
client = LabellerrClient(
    api_key='your_api_key',
    api_secret='your_api_secret',
    client_id='your_client_id'
)

try:
    # List all projects - returns list of LabellerrProject objects
    projects = list_projects(client)

    print(f"Found {len(projects)} projects:")
    for project in projects:
        print(f"- Project ID: {project.project_id}")
        print(f"  Data Type: {project.data_type}")
        print(f"  Attached Datasets: {len(project.attached_datasets)}")
        print(f"  Created By: {project.created_by}")
        print(f"  Status Code: {project.status_code}")
except LabellerrError as e:
    print(f"Failed to retrieve projects: {str(e)}")
```
This method is useful when you need to:
- List all projects for a client
- Find specific project IDs
- Check project statuses and configurations
- Get an overview of a client's work
- Access project properties programmatically
This returns a list of `LabellerrProject` objects, each with access to properties like `project_id`, `data_type`, `attached_datasets`, `created_by`, and more.
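For instance, to find specific project IDs programmatically, you can filter the returned objects by any property. A minimal sketch, reusing the `client` from the snippet above (the `'image'` filter value is only an illustration):

```python
from labellerr.core.projects import list_projects

# Collect the IDs of all image projects (the 'image' filter is illustrative)
image_project_ids = [
    project.project_id
    for project in list_projects(client)
    if project.data_type == 'image'
]
print(image_project_ids)
```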
```python
from labellerr.client import LabellerrClient
from labellerr.core.datasets import list_datasets
from labellerr.core.schemas import DataSetScope
from labellerr.core.exceptions import LabellerrError

# Initialize the client with your API credentials
client = LabellerrClient(
    api_key='your_api_key',
    api_secret='your_api_secret',
    client_id='your_client_id'
)

try:
    # Get all datasets using list_datasets with auto-pagination
    datasets_generator = list_datasets(
        client=client,
        datatype='image',
        scope=DataSetScope.client,  # or DataSetScope.project
        page_size=-1  # Auto-paginate through all datasets
    )

    # Iterate through the generator
    print("Datasets:")
    for dataset in datasets_generator:
        print(f"- Dataset ID: {dataset.get('dataset_id')}")
        print(f"  Name: {dataset.get('name')}")
        print(f"  Description: {dataset.get('description')}")
        print(f"  Data Type: {dataset.get('data_type')}")
        print(f"  Files Count: {dataset.get('files_count', 0)}")
except LabellerrError as e:
    print(f"Failed to retrieve datasets: {str(e)}")
```
Pagination Support:
- Returns a generator that yields individual datasets
- Use `page_size=-1` to automatically paginate through all datasets (recommended)
- Use a specific `page_size` to limit results (e.g., `page_size=20`)
- Use `last_dataset_id` for manual pagination across multiple requests (see the sketch after this list)
- The generator approach is memory-efficient for large numbers of datasets
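A minimal sketch of manual pagination, assuming the last `dataset_id` of each page is passed as `last_dataset_id` on the next call and that a partial or empty page signals the end of the results:

```python
from labellerr.core.datasets import list_datasets
from labellerr.core.schemas import DataSetScope

# Manual pagination sketch: fetch one page at a time and carry the last
# dataset_id forward. Assumes a page smaller than page_size (or an empty
# page) means there is nothing left to fetch.
last_dataset_id = None
while True:
    page = list(list_datasets(
        client=client,
        datatype='image',
        scope=DataSetScope.client,
        page_size=20,
        last_dataset_id=last_dataset_id
    ))
    if not page:
        break
    for dataset in page:
        print(dataset.get('name'))
    if len(page) < 20:
        break  # final partial page
    last_dataset_id = page[-1].get('dataset_id')
```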
Available Scope Options:
- `DataSetScope.client` - Datasets with client-level permissions
- `DataSetScope.project` - Datasets with project-level permissions
This method is useful when you need to:
- Get a comprehensive list of all datasets in your workspace
- Filter datasets by specific data types (image, video, audio, document, text)
- Organize datasets by scope (client-level or project-level permissions)
- Efficiently iterate through large numbers of datasets with pagination
- Process datasets memory-efficiently using generators
```python
from labellerr.core.datasets import list_datasets
from labellerr.core.schemas import DataSetScope

# Auto-paginate through all datasets (memory-efficient)
all_datasets = list_datasets(
    client=client,
    datatype='image',
    scope=DataSetScope.client,
    page_size=-1  # Automatically handles all pages
)

# Process datasets as they're retrieved
for dataset in all_datasets:
    print(f"Processing dataset: {dataset.get('name')}")
    # Do something with each dataset
```
```python
from labellerr.client import LabellerrClient
from labellerr.core.projects import LabellerrProject
from labellerr.core.exceptions import LabellerrError

client = LabellerrClient(
    api_key='your_api_key',
    api_secret='your_api_secret',
    client_id='your_client_id'
)

try:
    project = LabellerrProject(client=client, project_id="your_project_id")

    # Detach a single dataset
    result = project.detach_dataset_from_project(dataset_id="dataset_id_to_detach")

    # Or detach multiple datasets at once
    # result = project.detach_dataset_from_project(dataset_ids=["dataset_1", "dataset_2"])

    print(f"Dataset(s) detached successfully: {result}")
except LabellerrError as e:
    print(f"Failed to detach dataset: {str(e)}")
```
Important: When detaching datasets from a project, the method no longer requires `client_id` and `project_id` parameters, as these are automatically derived from the project instance.
Bulk assign is essential for automating annotation workflows:
Common Use Cases

| Use Case | Example |
| --- | --- |
| Batch Assignment | Assign 100 images to an annotator at once instead of clicking 100 times |
| Status Progression | Move all "annotation" files to "review" status after completion |
| Team Management | Distribute files across multiple team members efficiently |
| Workflow Automation | Automate the flow: annotation → review → client_review → accepted |
| Onboarding | Assign an initial set of files to new annotators |
| Quality Control | Send all files to a senior reviewer before client submission |
Example Scenario:
Imagine you have 500 images that need annotation. After the annotators finish, you want to move them all to `review` status. Without bulk assign, you'd click 500 times. With `bulk_assign_files()`, it's one API call:
```python
# After annotation is complete
result = project.bulk_assign_files(
    file_ids=all_500_file_ids,
    new_status="review",
    assign_to="[email protected]"
)
```
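Building on this, the Workflow Automation row in the table above can be scripted by chaining one `bulk_assign_files()` call per transition. A hedged sketch, assuming `files` holds the file dictionaries returned by `LabellerrDataset.fetch_files()` (covered below) and using hypothetical reviewer emails:

```python
# Hedged sketch: advance files through the workflow one transition at a
# time. Status names follow the table above; emails are hypothetical.
STATUS_FLOW = [
    ("annotation", "review", "[email protected]"),
    ("review", "client_review", "[email protected]"),
]

for current_status, next_status, assignee in STATUS_FLOW:
    file_ids = [f["file_id"] for f in files if f.get("status") == current_status]
    if file_ids:
        project.bulk_assign_files(
            file_ids=file_ids,
            new_status=next_status,
            assign_to=assignee,
        )
```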
The Labellerr SDK uses a custom exception class, `LabellerrError`, to indicate issues during API interactions. Always wrap your function calls in try-except blocks to handle errors gracefully.
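A minimal sketch of the pattern, reusing the `client` from the earlier snippets:

```python
from labellerr.core.exceptions import LabellerrError
from labellerr.core.projects import list_projects

try:
    projects = list_projects(client)
except LabellerrError as e:
    # Log, retry, or surface the failure as appropriate for your application
    print(f"Labellerr API call failed: {str(e)}")
```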
```python
from labellerr.core.projects import list_projects

def list_projects(client: LabellerrClient) -> list[LabellerrProject]:
    """
    Retrieves a list of projects associated with a client ID.

    Parameters:
        client: LabellerrClient instance

    Returns:
        List of LabellerrProject objects with properties:
        - project_id: str
        - data_type: str
        - attached_datasets: list
        - created_by: str
        - created_at: datetime
        - annotation_template_id: str
        - status_code: int
    """
```
Example:
```python
projects = list_projects(client)
for project in projects:
    print(f"{project.project_id}: {project.data_type}")
```
```python
from labellerr.core.datasets import list_datasets
from labellerr.core.schemas import DataSetScope

def list_datasets(
    client: LabellerrClient,
    datatype: str,
    scope: DataSetScope,
    page_size: int = 10,
    last_dataset_id: str = None
) -> Generator:
    """
    Retrieves datasets by parameters with pagination support.

    Parameters:
        client: LabellerrClient instance
        datatype: Type of data ('image', 'video', 'audio', 'document', 'text')
        scope: DataSetScope.client or DataSetScope.project
        page_size: Number of datasets per page (default: 10, -1 for auto-pagination)
        last_dataset_id: ID of last dataset from previous page (for manual pagination)

    Returns:
        Generator yielding dataset dictionaries with keys:
        - dataset_id: str
        - name: str
        - description: str
        - data_type: str
        - files_count: int
        - created_at: datetime
        - created_by: str
    """
```
Examples:
```python
# Auto-paginate through all
for dataset in list_datasets(client, 'image', DataSetScope.client, page_size=-1):
    print(dataset['name'])

# Get first 20 only
datasets = list(list_datasets(client, 'video', DataSetScope.client, page_size=20))
```
LabellerrDataset.fetch_files()
```python
from labellerr.core.datasets import LabellerrDataset

dataset = LabellerrDataset(client=client, dataset_id="dataset_id")
files = dataset.fetch_files()
```
Returns: List of file dictionaries with metadata including:
- `file_id`: str
- `file_name`: str
- `status`: str
- `file_type`: str
- `file_size`: int
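For example, you might summarize the returned metadata like this (the `"annotation"` status value is an assumption for illustration):

```python
# Count files still awaiting annotation and total up their sizes.
# The "annotation" status value is an assumed example.
pending = [f for f in files if f.get("status") == "annotation"]
total_bytes = sum(f.get("file_size", 0) for f in files)

print(f"{len(pending)} files still in annotation")
print(f"Total size: {total_bytes / 1_000_000:.1f} MB")
```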
LabellerrProject - Properties
```python
from labellerr.core.projects import LabellerrProject

project = LabellerrProject(client=client, project_id="project_id")

# Available properties:
project.project_id              # str: Project identifier
project.data_type               # str: Data type (image, video, etc.)
project.attached_datasets       # list: List of dataset IDs
project.created_by              # str: Creator email
project.created_at              # datetime: Creation timestamp
project.annotation_template_id  # str: Template ID
project.status_code             # int: Project status code
```
- **Use Auto-Pagination**: Set `page_size=-1` for `list_datasets()` to automatically handle all pages without manual intervention
- **Leverage Generators**: Process datasets as they're retrieved instead of loading all of them into memory at once
- **Filter by Scope**: Use `DataSetScope.client` for workspace-level datasets or `DataSetScope.project` for project-specific ones
- **Error Handling**: Always wrap API calls in try-except blocks to catch `LabellerrError` exceptions
The Labellerr SDK is a fast and reliable solution for managing annotation workflows. Want to try it end-to-end? Refer to this Google Colab Cookbook for a ready-to-run tutorial. For more related cookbooks and examples, please visit our repository: Labellerr Hands-On Learning.