radiens_drive_catalog
radiens-drive-catalog
Programmatic catalog and sync tool for xdat neural datasets stored on Google Drive. Datasets are uniquely identified by (drive_path, base_name) and queryable by drive path exact match, prefix, or substring.
Classes
AssetEntry
Bases: TypedDict
One non-xdat asset record stored in the catalog.
Represents a single file or folder found on Drive that is not an xdat
dataset — for example a logs/ directory, a PowerPoint, or a writeup.
Attributes:
| Name | Type | Description |
|---|---|---|
| `asset_name` | `str` | The file or folder name as it appears on Drive (e.g. |
| `asset_type` | `Literal['folder', 'file']` | |
| `drive_path` | `str` | Slash-joined path from the root folder to the parent folder of this asset (e.g. |
| `drive_id` | `str` | Google Drive ID of this file or folder. |
| `mime_type` | `str` | MIME type as reported by the Drive API. |
| `local_path` | `str \| None` | Absolute local path to the downloaded file or folder, or |
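
As a reading aid, the attribute table above can be written out as a `TypedDict`. This is a sketch that mirrors the documented field names and types, not the library's source; the sample record and its `drive_id` value are hypothetical.

```python
from typing import Literal, Optional, TypedDict


class AssetEntry(TypedDict):
    """Mirror of the documented fields (illustrative, not the library's source)."""
    asset_name: str
    asset_type: Literal["folder", "file"]
    drive_path: str
    drive_id: str
    mime_type: str
    local_path: Optional[str]


# Hypothetical record for a logs/ folder that has not been downloaded yet.
example_asset: AssetEntry = {
    "asset_name": "logs",
    "asset_type": "folder",
    "drive_path": "2026-02-15_batch/reaching",
    "drive_id": "hypothetical-drive-id",
    "mime_type": "application/vnd.google-apps.folder",
    "local_path": None,
}
```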
Catalog
Main interface for radiens-drive-catalog.
Wraps Google Drive scanning, local JSON catalog management, and file
download. Querying is done directly on the df and assets_df
DataFrames using standard pandas operations.
Recordings are uniquely identified by (drive_path, base_name) where
drive_path is the slash-joined path from the Drive root to the
containing folder.
Attributes
assets_df
property
The full assets catalog as a pandas DataFrame. Columns: asset_name, asset_type, drive_path, drive_id, mime_type, local_path
df
property
The full dataset catalog as a pandas DataFrame. Columns: base_name, drive_path, drive_file_ids, local_path
Functions
__init__
Initialize a Catalog.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `Config \| None` | Package configuration. When omitted, | `None` |
| `quiet` | `bool` | If | `False` |
download_asset
Download an asset (file or folder) to the local assets directory.
Assets are stored under {local_data_dir}/assets/{drive_path}/{asset_name}.
For folder assets the entire subtree is downloaded recursively.
After a successful download, persists the local_path back to the
catalog JSON.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `drive_path` | `str` | The slash-joined path to the asset's parent folder (e.g. | required |
| `asset_name` | `str` | The file or folder name (e.g. | required |

Returns:

| Type | Description |
|---|---|
| `str` | The local path to the downloaded file or folder. |

Raises:

| Type | Description |
|---|---|
| `EntryNotFoundError` | If the asset is not found in the catalog. |
download_dataset
Download the xdat files for a dataset to the local data directory.
Files are stored under {local_data_dir}/{drive_path}/, mirroring
the Drive folder hierarchy. After a successful download, persists the
local_path back to the catalog JSON.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `drive_path` | `str` | The slash-joined path to the folder containing the dataset (as shown in | required |
| `base_name` | `str` | The dataset identifier (shared filename stem). | required |

Returns:

| Type | Description |
|---|---|
| `str` | The local directory path where the files were written. |

Raises:

| Type | Description |
|---|---|
| `EntryNotFoundError` | If the dataset is not found in the catalog. |
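
The two storage layouts described for `download_asset` and `download_dataset` can be sketched with `pathlib`. The helper names `dataset_dir` and `asset_path` and the `/data` root are hypothetical; only the path templates come from the text above.

```python
from pathlib import PurePosixPath


def dataset_dir(local_data_dir: str, drive_path: str) -> PurePosixPath:
    # Datasets: {local_data_dir}/{drive_path}/
    return PurePosixPath(local_data_dir) / drive_path


def asset_path(local_data_dir: str, drive_path: str, asset_name: str) -> PurePosixPath:
    # Assets: {local_data_dir}/assets/{drive_path}/{asset_name}
    return PurePosixPath(local_data_dir) / "assets" / drive_path / asset_name


dataset_location = str(dataset_dir("/data", "2026-02-15_batch/reaching"))
asset_location = str(asset_path("/data", "2026-02-15_batch/reaching", "logs"))
```

Keeping assets under a separate `assets/` subtree means an asset and a dataset folder with the same `drive_path` do not collide locally.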
get_asset_path
Return the local path for an asset, downloading if needed.
If the asset has not been downloaded yet, or if the recorded
local_path no longer exists on disk, the download is triggered
automatically.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `drive_path` | `str` | The slash-joined path to the asset's parent folder (e.g. | required |
| `asset_name` | `str` | The file or folder name (e.g. | required |

Returns:

| Type | Description |
|---|---|
| `str` | The local path to the downloaded file or folder. |

Raises:

| Type | Description |
|---|---|
| `EntryNotFoundError` | If the asset is not found in the catalog. |
get_dataset_path
Return the local directory path for a dataset, downloading if needed.
If the dataset has not been downloaded yet, or if the recorded
local_path no longer exists on disk, the download is triggered
automatically.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `drive_path` | `str` | The slash-joined path to the folder containing the dataset (as shown in | required |
| `base_name` | `str` | The dataset identifier (shared filename stem). | required |

Returns:

| Type | Description |
|---|---|
| `str` | The local directory path where the xdat files reside. |

Raises:

| Type | Description |
|---|---|
| `EntryNotFoundError` | If the dataset is not found in the catalog. |
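
The "download if needed" rule shared by `get_asset_path` and `get_dataset_path` can be modeled as follows. This is a minimal sketch, not the library's code: `ensure_local` and `fake_download` are hypothetical, and the stand-in downloader only creates a temporary directory.

```python
import os
import tempfile
from typing import Callable, Optional


def ensure_local(entry: dict, download: Callable[[dict], str]) -> str:
    """Return the entry's local path, calling download() only when needed."""
    local_path: Optional[str] = entry.get("local_path")
    # Download when nothing is recorded or the recorded path has disappeared.
    if local_path is None or not os.path.exists(local_path):
        local_path = download(entry)
        entry["local_path"] = local_path
    return local_path


download_calls = []


def fake_download(entry: dict) -> str:
    # Stand-in for a real Drive download: just creates an empty directory.
    download_calls.append(entry["base_name"])
    return tempfile.mkdtemp()


entry = {"base_name": "rec_a", "local_path": None}
first_path = ensure_local(entry, fake_download)   # triggers a download
second_path = ensure_local(entry, fake_download)  # reuses the recorded path
```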
list_assets
Query the asset catalog and return a filtered DataFrame.
All filters are applied together (AND semantics). Omitting all arguments returns the full asset catalog.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `drive_path` | `str \| None` | Exact | `None` |
| `drive_path_prefix` | `str \| None` | Return only rows whose | `None` |
| `drive_path_contains` | `str \| None` | Return only rows whose | `None` |
| `asset_type` | `str \| None` | Filter by | `None` |
Examples:

    catalog.list_assets()  # everything
    catalog.list_assets(drive_path="2026-02-15_batch/reaching")  # exact parent folder
    catalog.list_assets(drive_path_prefix="2026-02-15_batch")  # subtree
    catalog.list_assets(asset_type="folder")  # folders only
list_datasets
Query the dataset catalog and return a filtered DataFrame.
All filters are applied together (AND semantics). Omitting all arguments returns the full catalog.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `drive_path` | `str \| None` | Exact | `None` |
| `drive_path_prefix` | `str \| None` | Return only rows whose | `None` |
| `drive_path_contains` | `str \| None` | Return only rows whose | `None` |
Examples:

    catalog.list_datasets()  # everything
    catalog.list_datasets(drive_path="2026-02-15_batch/reaching")  # exact folder
    catalog.list_datasets(drive_path_prefix="2026-02-15_batch")  # subtree
    catalog.list_datasets(drive_path_contains="reaching")  # any depth
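
The AND semantics of the three `drive_path` filters can be modeled in plain Python. This is an illustrative sketch, not the library's implementation: `filter_rows` and the sample rows are hypothetical, and plain case-sensitive string matching is an assumption.

```python
from typing import Optional

# Hypothetical catalog rows; only the two identifying columns are shown.
ROWS = [
    {"drive_path": "2026-02-15_batch/reaching", "base_name": "rec_a"},
    {"drive_path": "2026-02-15_batch/grasping", "base_name": "rec_b"},
    {"drive_path": "2026-03-01_batch/reaching", "base_name": "rec_c"},
]


def filter_rows(rows,
                drive_path: Optional[str] = None,
                drive_path_prefix: Optional[str] = None,
                drive_path_contains: Optional[str] = None):
    """Keep rows that satisfy every supplied filter; None means 'no constraint'."""
    kept = []
    for row in rows:
        path = row["drive_path"]
        if drive_path is not None and path != drive_path:
            continue
        if drive_path_prefix is not None and not path.startswith(drive_path_prefix):
            continue
        if drive_path_contains is not None and drive_path_contains not in path:
            continue
        kept.append(row)
    return kept


everything = filter_rows(ROWS)
subtree = filter_rows(ROWS, drive_path_prefix="2026-02-15_batch")
combined = filter_rows(ROWS, drive_path_prefix="2026-02-15_batch",
                       drive_path_contains="reaching")
```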
prefetch
prefetch(
drive_path=None,
drive_path_prefix=None,
drive_path_contains=None,
*,
datasets=True,
assets=True,
asset_type=None,
force=False,
)
Bulk-download matching datasets and assets, skipping already-local items.
Idempotent by default: entries whose recorded local_path already
exists on disk are skipped. Running twice in a row issues no downloads
on the second call. This makes prefetch safe to use both for
proactive ("pre-warm before going offline") and reactive ("ensure
everything under X is available") workflows.
The drive_path filters have the same semantics as in list_datasets and
list_assets: they are AND-combined, and any omitted filter acts as a
wildcard.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `drive_path` | `str \| None` | Exact | `None` |
| `drive_path_prefix` | `str \| None` | Match rows whose | `None` |
| `drive_path_contains` | `str \| None` | Match rows whose | `None` |
| `datasets` | `bool` | When | `True` |
| `assets` | `bool` | When | `True` |
| `asset_type` | `str \| None` | Restrict to | `None` |
| `force` | `bool` | When | `False` |
Returns:
| Type | Description |
|---|---|
| `PrefetchResult` | A `PrefetchResult` summary of the call. |
Examples:

    catalog.prefetch()  # everything
    catalog.prefetch(drive_path_prefix="2026-02")  # subtree
    catalog.prefetch(drive_path_prefix="2026-02", assets=False)  # datasets only
    catalog.prefetch(drive_path_prefix="2026-02", asset_type="folder")
    catalog.prefetch(force=True)  # re-download everything
    catalog.prefetch(drive_path_prefix="2026-02", force=True)  # re-download a subtree
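
The idempotent skip rule and the effect of `force` can be sketched as below. This models the documented behavior for datasets only and is not the library's code: `PrefetchCounts`, `prefetch_datasets`, and `fake_download` are hypothetical, and counting forced re-downloads as "downloaded" is an assumption.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class PrefetchCounts:
    """Hypothetical stand-in for the dataset fields of PrefetchResult."""
    datasets_downloaded: int = 0
    datasets_skipped: int = 0


def prefetch_datasets(entries, download, force=False) -> PrefetchCounts:
    counts = PrefetchCounts()
    for entry in entries:
        path = entry.get("local_path")
        # Skip entries whose recorded path still exists, unless force is set.
        if not force and path is not None and os.path.exists(path):
            counts.datasets_skipped += 1
            continue
        entry["local_path"] = download(entry)
        counts.datasets_downloaded += 1
    return counts


def fake_download(entry) -> str:
    # Stand-in for a real Drive download: creates an empty directory.
    return tempfile.mkdtemp()


entries = [{"base_name": "rec_a", "local_path": None},
           {"base_name": "rec_b", "local_path": None}]
first_run = prefetch_datasets(entries, fake_download)
second_run = prefetch_datasets(entries, fake_download)            # nothing to do
forced_run = prefetch_datasets(entries, fake_download, force=True)
```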
scan
Scan Drive and rebuild the catalog JSON.
Any existing local_path entries are preserved so a rescan doesn't forget which datasets or assets have already been downloaded.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `flat` | `bool` | If | `True` |
Returns:
| Type | Description |
|---|---|
| `ScanResult` | A `ScanResult` with new, existing, and removed counts for datasets and assets. |
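
One way to picture a rescan that preserves `local_path` values is a merge keyed on `(drive_path, base_name)`. The sketch below is an assumption about how such a merge could work, not the library's implementation; `merge_scan` and the sample entries are hypothetical.

```python
def merge_scan(previous, scanned):
    """Rebuild the dataset list from `scanned`, carrying over known local paths."""
    def key(entry):
        return (entry["drive_path"], entry["base_name"])

    old = {key(e): e for e in previous}
    merged = []
    new = existing = 0
    for entry in scanned:
        prior = old.get(key(entry))
        if prior is None:
            new += 1
            local_path = None
        else:
            existing += 1
            local_path = prior.get("local_path")  # preserved across the rescan
        merged.append({**entry, "local_path": local_path})

    seen = {key(e) for e in scanned}
    removed = sum(1 for k in old if k not in seen)
    counts = {"datasets_new": new, "datasets_existing": existing,
              "datasets_removed": removed}
    return merged, counts


previous = [
    {"drive_path": "a", "base_name": "x", "local_path": "/data/a"},
    {"drive_path": "a", "base_name": "gone", "local_path": None},
]
scanned = [
    {"drive_path": "a", "base_name": "x"},
    {"drive_path": "b", "base_name": "x"},  # same stem, different folder: distinct
]
merged, counts = merge_scan(previous, scanned)
```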
CatalogError
Bases: Exception
Base class for all radiens-drive-catalog errors.
Config
dataclass
Configuration for radiens-drive-catalog.
All path fields (credentials_path, local_data_dir, catalog_path) support
~ and $ENV_VAR expansion and are resolved to absolute paths on construction.
Typically created via Config.from_file() rather than directly.
Attributes:
| Name | Type | Description |
|---|---|---|
| `credentials_path` | `str` | Path to the Google service account credentials JSON file. |
| `root_folder_id` | `str` | Google Drive folder ID of the data root folder. |
| `local_data_dir` | `str` | Local directory where datasets will be downloaded. |
| `catalog_path` | `str` | Path to the catalog JSON file (created by |
Functions
__post_init__
Expand ~ and $ENV_VARS in all path fields and resolve to absolute paths.
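
The expansion step can be approximated with the standard library. This is a sketch of the described behavior, not the library's code; `expand_path` is a hypothetical helper, and expanding environment variables before `~` is an assumption.

```python
import os


def expand_path(raw: str) -> str:
    """Expand $ENV_VAR and ~, then resolve to an absolute path."""
    expanded = os.path.expanduser(os.path.expandvars(raw))
    return os.path.abspath(expanded)
```

For example, with `HOME=/home/alice` a value such as `~/data` would resolve to `/home/alice/data`.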
from_file
classmethod
Load config from a JSON file.
Resolution order when path is None:
1. `RADIENS_DRIVE_CATALOG_CONFIG` environment variable.
2. `.secrets/config.json` in the current working directory.
3. `config.json` in the current working directory.
4. `~/.config/radiens-drive/config.json` in the user's home directory.
5. `/etc/radiens-drive/config.json`.
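
The resolution order above amounts to "first candidate that exists wins". A minimal sketch follows; `candidate_paths` and `resolve_config_path` are hypothetical helpers, and falling through to the next candidate when the environment variable points at a missing file is an assumption rather than documented behavior.

```python
import os

ENV_VAR = "RADIENS_DRIVE_CATALOG_CONFIG"


def candidate_paths(cwd: str, home: str) -> list:
    """List config locations in the documented priority order."""
    candidates = []
    env_value = os.environ.get(ENV_VAR)
    if env_value:
        candidates.append(env_value)
    candidates += [
        os.path.join(cwd, ".secrets", "config.json"),
        os.path.join(cwd, "config.json"),
        os.path.join(home, ".config", "radiens-drive", "config.json"),
        "/etc/radiens-drive/config.json",
    ]
    return candidates


def resolve_config_path(cwd: str, home: str, exists=os.path.isfile) -> str:
    """Return the first candidate for which exists() is true."""
    for candidate in candidate_paths(cwd, home):
        if exists(candidate):
            return candidate
    raise FileNotFoundError("no config file could be located")
```

The `exists` parameter is injectable so the ordering can be checked without touching the real filesystem.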
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| None` | Path to the config JSON file. When | `None` |

Returns:

| Type | Description |
|---|---|
| `Config` | A |

Raises:

| Type | Description |
|---|---|
| `FileNotFoundError` | If no config file can be located. |
| `JSONDecodeError` | If the config file is not valid JSON. |
| `TypeError` | If the JSON fields do not match the |
DatasetEntry
Bases: TypedDict
One xdat dataset record stored in the catalog.
Represents a single neural recording dataset. The three xdat files
(_data.xdat, .xdat.json, _timestamp.xdat) share a common
base_name stem.
Attributes:
| Name | Type | Description |
|---|---|---|
| `base_name` | `str` | Shared filename stem across all three xdat files. Note: |
| `drive_path` | `str` | Slash-joined path from the root folder to the folder containing the dataset files (e.g. |
| `drive_file_ids` | `dict[str, str]` | Maps file type labels ( |
| `local_path` | `str \| None` | Absolute path to the local directory where the dataset has been downloaded, or |
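
To illustrate how three files can share one `base_name` stem, the sketch below strips the documented suffixes (`_data.xdat`, `.xdat.json`, `_timestamp.xdat`) and groups filenames by what remains. The helpers `split_xdat_name` and `group_by_base_name` are hypothetical, and the library's actual grouping rules may differ.

```python
XDAT_SUFFIXES = ("_data.xdat", "_timestamp.xdat", ".xdat.json")


def split_xdat_name(filename: str):
    """Return (base_name, suffix) for an xdat file, or None for anything else."""
    for suffix in XDAT_SUFFIXES:
        if filename.endswith(suffix) and len(filename) > len(suffix):
            return filename[: -len(suffix)], suffix
    return None


def group_by_base_name(filenames):
    """Map each base_name to the set of xdat suffixes found for it."""
    groups = {}
    for name in filenames:
        parts = split_xdat_name(name)
        if parts is None:
            continue  # not an xdat file; in catalog terms this would be an asset
        base_name, suffix = parts
        groups.setdefault(base_name, set()).add(suffix)
    return groups


files = ["rec_a_data.xdat", "rec_a.xdat.json", "rec_a_timestamp.xdat", "notes.pptx"]
groups = group_by_base_name(files)
```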
EntryNotFoundError
Bases: CatalogError
Raised when a dataset or asset cannot be found in the catalog.
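
Because `EntryNotFoundError` derives from `CatalogError`, callers can catch either the specific error or the base class. The sketch below uses local stand-in classes that mirror the documented hierarchy, plus a hypothetical `find_dataset` lookup keyed on `(drive_path, base_name)`; it does not import the real package.

```python
class CatalogError(Exception):
    """Stand-in for the documented base error."""


class EntryNotFoundError(CatalogError):
    """Stand-in for the documented lookup error."""


def find_dataset(entries, drive_path: str, base_name: str) -> dict:
    # A dataset is identified by the pair (drive_path, base_name).
    for entry in entries:
        if (entry["drive_path"], entry["base_name"]) == (drive_path, base_name):
            return entry
    raise EntryNotFoundError(f"no dataset {base_name!r} under {drive_path!r}")


entries = [{"drive_path": "2026-02-15_batch/reaching", "base_name": "rec_a"}]

try:
    find_dataset(entries, "2026-02-15_batch/reaching", "missing")
    outcome = "found"
except CatalogError as exc:  # the base class also catches EntryNotFoundError
    outcome = type(exc).__name__
```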
PrefetchResult
dataclass
Summary of a `Catalog.prefetch` call.
Attributes:
| Name | Type | Description |
|---|---|---|
| `datasets_downloaded` | `int` | Number of datasets fetched from Drive. |
| `datasets_skipped` | `int` | Number of datasets already available locally. |
| `assets_downloaded` | `int` | Number of assets fetched from Drive. |
| `assets_skipped` | `int` | Number of assets already available locally. |
ScanResult
dataclass
Summary of a `Catalog.scan` call.
Attributes:
| Name | Type | Description |
|---|---|---|
| `datasets_new` | `int` | Datasets found on Drive that were not in the previous catalog. |
| `datasets_existing` | `int` | Datasets found on Drive that were already in the catalog. |
| `datasets_removed` | `int` | Datasets in the previous catalog that were not found on Drive. |
| `assets_new` | `int` | Assets found on Drive that were not in the previous catalog. |
| `assets_existing` | `int` | Assets found on Drive that were already in the catalog. |
| `assets_removed` | `int` | Assets in the previous catalog that were not found on Drive. |