radiens_drive_catalog.catalog

catalog.py

The main interface for radiens-drive-catalog. The Catalog class wraps Drive scanning, local caching, and downloading behind a simple API.

Typical usage

from radiens_drive_catalog import Catalog, Config

config = Config.from_file("config.json")
catalog = Catalog(config)

# Build or refresh the catalog from Drive
catalog.scan()

# Query using convenience methods or pandas directly
df = catalog.list_recordings(drive_path_prefix="2026-02-15_batch")
items = catalog.list_items(is_folder=True)

# Get a local path, downloading if needed
path = catalog.get_recording_path("2026-02-15_batch/reaching", "rat01_2026-02-14_probe1")
path = catalog.get_item_path("2026-02-15_batch/reaching", "logs")

# Unique-basename lookup
rec = catalog.get_recording("rat01_2026-02-14_probe1")
path = catalog.get_recording_path(rec["drive_path"], rec["base_name"])

# Download explicitly
catalog.download_recording("2026-02-15_batch/reaching", "rat01_2026-02-14_probe1")
catalog.download_item("2026-02-15_batch/reaching", "logs")

# Bulk prefetch (idempotent: skips items already on disk)
result = catalog.prefetch(drive_path_prefix="2026-02-15_batch")

# Access the raw DataFrames
catalog.recordings_df
catalog.items_df

# Print a Drive tree
print(catalog.file_tree())

Classes

Catalog

Main interface for radiens-drive-catalog.

Wraps Google Drive scanning, local JSON catalog management, and file download. Query with the list_recordings and list_items convenience methods, or directly on the recordings_df and items_df DataFrames using standard pandas operations.

Recordings are uniquely identified by (drive_path, base_name) where drive_path is the slash-joined path from the Drive root to the containing folder.

Example
from radiens_drive_catalog import Catalog, Config

config = Config.from_file("config.json")
catalog = Catalog(config)

catalog.scan()
hits = catalog.list_recordings(drive_path_prefix="2026-02")
path = catalog.get_recording_path("2026-02/reaching", "rat01")
catalog.prefetch(drive_path_prefix="2026-02")
print(catalog.file_tree())
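
Because a recording is identified by the pair (drive_path, base_name), the same base_name can appear in two folders without clashing. The following is a minimal sketch of that keying using a plain dict, not the library's own storage; the helper make_drive_path and the sample folder names are hypothetical.

```python
def make_drive_path(*folders: str) -> str:
    """Join folder names from the Drive root into a slash-joined drive_path."""
    return "/".join(part.strip("/") for part in folders if part.strip("/"))


# Index recordings by the composite key (drive_path, base_name).
index = {}
for folders, base_name in [
    (("2026-02", "reaching"), "rat01"),
    (("2026-02", "grasping"), "rat01"),  # same base_name, different folder
]:
    key = (make_drive_path(*folders), base_name)
    index[key] = {"drive_path": key[0], "base_name": base_name}

# Both entries survive because their drive_path differs.
print(sorted(index))
```

A lookup by base_name alone would match both entries here, which is the case get_recording reports as ambiguous.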

Attributes

items_df property
items_df

The full Drive items catalog as a pandas DataFrame.

Columns: name, is_folder, drive_path, drive_id, mime_type, local_path, upload_time

recordings_df property
recordings_df

The full recording catalog as a pandas DataFrame.

Columns: base_name, drive_path, drive_file_ids, local_path, upload_time

Functions

download_item
download_item(drive_path, name)

Download a Drive item (file or folder) to the local data directory.

Items are stored under {local_data_dir}/{drive_path}/{name}.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| drive_path | str | The slash-joined path to the item's parent folder. | required |
| name | str | The file or folder name. | required |

Returns:

| Type | Description |
|---|---|
| str | The local path to the downloaded file or folder. |

Raises:

| Type | Description |
|---|---|
| EntryNotFoundError | If the item is not found in the catalog. |
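
The storage layout {local_data_dir}/{drive_path}/{name} can be illustrated with pathlib. This is a sketch of the path arithmetic only, under the assumption that drive_path is always slash-joined; item_local_path is a hypothetical helper, not part of the library.

```python
from pathlib import Path


def item_local_path(local_data_dir: str, drive_path: str, name: str) -> Path:
    """Build {local_data_dir}/{drive_path}/{name} from a slash-joined drive_path."""
    # Split on "/" so the result uses the host OS separator.
    parts = [p for p in drive_path.split("/") if p]
    return Path(local_data_dir).joinpath(*parts, name)


path = item_local_path("data", "2026-02-15_batch/reaching", "logs")
print(path.as_posix())
```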

download_recording
download_recording(drive_path, base_name)

Download the xdat files for a recording to the local data directory.

Files are stored under {local_data_dir}/{drive_path}/.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| drive_path | str | The slash-joined path to the folder containing the recording. | required |
| base_name | str | The recording identifier (shared filename stem). | required |

Returns:

| Type | Description |
|---|---|
| str | The local directory path where the files were written. |

Raises:

| Type | Description |
|---|---|
| EntryNotFoundError | If the recording is not found in the catalog. |
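
To make "shared filename stem" concrete, the sketch below groups filenames into recordings by stripping a known suffix. The suffix list is an assumption made for illustration (this page does not state which xdat file names the library recognizes), and group_by_base_name is a hypothetical helper.

```python
from collections import defaultdict

# Assumed xdat suffixes; adjust to match the files actually present on Drive.
XDAT_SUFFIXES = ("_data.xdat", "_timestamp.xdat", ".xdat.json")


def group_by_base_name(filenames):
    """Map each base_name to the sorted list of files sharing that stem."""
    groups = defaultdict(list)
    for fname in filenames:
        for suffix in XDAT_SUFFIXES:
            if fname.endswith(suffix):
                groups[fname[: -len(suffix)]].append(fname)
                break  # a file belongs to at most one recording
    return {base: sorted(files) for base, files in groups.items()}


files = ["rat01_data.xdat", "rat01_timestamp.xdat", "rat01.xdat.json", "notes.txt"]
print(group_by_base_name(files))
```

Files that match no suffix, such as notes.txt above, are left out of the grouping.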

file_tree
file_tree()

Return a string rendering of the Drive tree with local-presence indicators.

Each entry is annotated as [recording], [folder], or [file] and [local] or [not local] based on whether it has been downloaded.

Returns:

| Type | Description |
|---|---|
| str | A multi-line indented string representing the Drive folder hierarchy. |
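
As an illustration of the kind of output described above, here is a small stand-alone renderer. The exact layout produced by file_tree is not specified on this page, so the indentation, the trailing slash on folders, and the render_tree helper are all assumptions.

```python
def render_tree(entries):
    """Render (drive_path, name, kind, is_local) tuples as an indented tree."""
    lines = []
    seen_folders = set()
    for drive_path, name, kind, is_local in sorted(entries):
        parts = [p for p in drive_path.split("/") if p]
        # Emit each ancestor folder once, indented by its depth.
        for depth in range(len(parts)):
            prefix = "/".join(parts[: depth + 1])
            if prefix not in seen_folders:
                seen_folders.add(prefix)
                lines.append("  " * depth + parts[depth] + "/")
        presence = "[local]" if is_local else "[not local]"
        lines.append("  " * len(parts) + f"{name} [{kind}] {presence}")
    return "\n".join(lines)


entries = [
    ("2026-02/reaching", "rat01", "recording", True),
    ("2026-02/reaching", "logs", "folder", False),
    ("2026-02", "notes.txt", "file", False),
]
print(render_tree(entries))
```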

get_item_path
get_item_path(drive_path, name)

Return the local path for a Drive item, downloading if needed.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| drive_path | str | The slash-joined path to the item's parent folder. | required |
| name | str | The file or folder name. | required |

Returns:

| Type | Description |
|---|---|
| str | The local path to the downloaded file or folder. |

Raises:

| Type | Description |
|---|---|
| EntryNotFoundError | If the item is not found in the catalog. |
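
The "downloading if needed" behavior follows a common get-or-fetch pattern. The sketch below shows that pattern in isolation, assuming local presence is decided by checking the filesystem; get_or_download and the fake_download callback are hypothetical and do not call Drive.

```python
import tempfile
from pathlib import Path
from typing import Callable


def get_or_download(local_path: Path, download: Callable[[Path], None]) -> Path:
    """Return local_path, calling download(local_path) only when it is missing."""
    if not local_path.exists():
        local_path.parent.mkdir(parents=True, exist_ok=True)
        download(local_path)
    return local_path


def demo() -> int:
    """Request the same path twice and return how many downloads happened."""
    calls = []

    def fake_download(target: Path) -> None:
        calls.append(target)
        target.write_text("payload")  # stands in for a real Drive download

    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "2026-02-15_batch" / "reaching" / "logs"
        get_or_download(target, fake_download)
        get_or_download(target, fake_download)  # already present, no download
    return len(calls)


print(demo())
```

The second request finds the file on disk, so only one download is recorded.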

get_recording
get_recording(base_name)

Look up a recording by base_name, requiring it to be unique.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| base_name | str | The recording's filename stem. | required |

Returns:

| Type | Description |
|---|---|
| RecordingEntry | The matching RecordingEntry. |

Raises:

| Type | Description |
|---|---|
| EntryNotFoundError | If no recording with this base_name exists. |
| AmbiguousRecordingError | If more than one recording matches. |
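
The uniqueness rule can be sketched as a three-way branch on the number of matches. The exception classes below are local stand-ins that only share a name with the library's; their base class, the message text, and the find_unique helper are assumptions.

```python
class EntryNotFoundError(LookupError):
    """Stand-in for the library's exception of the same name."""


class AmbiguousRecordingError(LookupError):
    """Stand-in for the library's exception of the same name."""


def find_unique(recordings, base_name):
    """Return the single row with this base_name, or raise if zero or many match."""
    matches = [row for row in recordings if row["base_name"] == base_name]
    if not matches:
        raise EntryNotFoundError(base_name)
    if len(matches) > 1:
        paths = ", ".join(row["drive_path"] for row in matches)
        raise AmbiguousRecordingError(f"{base_name} found in: {paths}")
    return matches[0]


recordings = [
    {"drive_path": "2026-02/reaching", "base_name": "rat01"},
    {"drive_path": "2026-02/grasping", "base_name": "rat01"},
    {"drive_path": "2026-02/reaching", "base_name": "rat02"},
]
print(find_unique(recordings, "rat02")["drive_path"])
```

When a base_name is ambiguous, passing the full (drive_path, base_name) pair to get_recording_path avoids the problem.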

get_recording_path
get_recording_path(drive_path, base_name)

Return the local directory path for a recording, downloading if needed.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| drive_path | str | The slash-joined path to the folder containing the recording. | required |
| base_name | str | The recording identifier (shared filename stem). | required |

Returns:

| Type | Description |
|---|---|
| str | The local directory path where the xdat files reside. |

Raises:

| Type | Description |
|---|---|
| EntryNotFoundError | If the recording is not found in the catalog. |

list_items
list_items(
    drive_path=None,
    drive_path_prefix=None,
    drive_path_contains=None,
    is_folder=None,
)

Query the Drive items catalog and return a filtered DataFrame.

All filters are applied together (AND semantics). Omitting all arguments returns the full items catalog.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| drive_path | str \| None | Exact drive_path match for the item's parent folder. | None |
| drive_path_prefix | str \| None | Return only rows whose drive_path starts with this string. | None |
| drive_path_contains | str \| None | Return only rows whose drive_path contains this substring. | None |
| is_folder | bool \| None | When True return only folders; when False return only files; when None (default) return both. | None |

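
The AND semantics and the three-state is_folder flag can be sketched over plain dict rows. This is an illustration of the filtering rules as described above, not the library's implementation; filter_rows and the sample rows are hypothetical.

```python
from typing import Optional


def filter_rows(rows, drive_path: Optional[str] = None,
                drive_path_prefix: Optional[str] = None,
                drive_path_contains: Optional[str] = None,
                is_folder: Optional[bool] = None):
    """Keep rows that satisfy every filter that is not None."""
    def keep(row):
        path = row["drive_path"]
        if drive_path is not None and path != drive_path:
            return False
        if drive_path_prefix is not None and not path.startswith(drive_path_prefix):
            return False
        if drive_path_contains is not None and drive_path_contains not in path:
            return False
        # None means "both"; True or False must match the row exactly.
        if is_folder is not None and row["is_folder"] != is_folder:
            return False
        return True

    return [row for row in rows if keep(row)]


rows = [
    {"name": "logs", "drive_path": "2026-02-15_batch/reaching", "is_folder": True},
    {"name": "notes.txt", "drive_path": "2026-02-15_batch/reaching", "is_folder": False},
    {"name": "logs", "drive_path": "2026-03-01_batch/grasping", "is_folder": True},
]
hits = filter_rows(rows, drive_path_prefix="2026-02", is_folder=True)
print([row["name"] for row in hits])
```

Calling filter_rows(rows) with no filters returns every row, mirroring the "omitting all arguments" case.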
list_recordings
list_recordings(
    drive_path=None,
    drive_path_prefix=None,
    drive_path_contains=None,
)

Query the recording catalog and return a filtered DataFrame.

All filters are applied together (AND semantics). Omitting all arguments returns the full catalog.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| drive_path | str \| None | Exact drive_path match. | None |
| drive_path_prefix | str \| None | Return only rows whose drive_path starts with this string. | None |
| drive_path_contains | str \| None | Return only rows whose drive_path contains this substring. | None |

prefetch
prefetch(
    drive_path=None,
    drive_path_prefix=None,
    drive_path_contains=None,
    *,
    recordings=True,
    items=True,
    is_folder=None,
    force=False,
)

Bulk-download matching recordings and items, skipping already-local entries.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| drive_path | str \| None | Exact drive_path match. | None |
| drive_path_prefix | str \| None | Match rows whose drive_path starts with this string. | None |
| drive_path_contains | str \| None | Match rows whose drive_path contains this substring. | None |
| recordings | bool | When False, no recordings are downloaded. | True |
| items | bool | When False, no items are downloaded. | True |
| is_folder | bool \| None | Restrict item downloads to folders (True) or files (False). | None |
| force | bool | When True, re-download regardless of local presence. | False |

Returns:

| Type | Description |
|---|---|
| PrefetchResult | A PrefetchResult with per-category download and skip counts. |
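
The skip-unless-forced rule and the resulting counts can be sketched as a single loop. Here local presence is assumed to mean a non-empty local_path field, which may differ from how the library checks it; Counts, prefetch_rows, and fake_download are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical tally, analogous to one category of PrefetchResult."""
    downloaded: int = 0
    skipped: int = 0


def prefetch_rows(rows, download, force=False):
    """Download each row unless it already has a local_path (or force is set)."""
    counts = Counts()
    for row in rows:
        if row.get("local_path") and not force:
            counts.skipped += 1
            continue
        row["local_path"] = download(row)
        counts.downloaded += 1
    return counts


def fake_download(row):
    # Stand-in for a real Drive download; returns the would-be local path.
    return f"data/{row['drive_path']}/{row['name']}"


rows = [
    {"drive_path": "2026-02/reaching", "name": "logs", "local_path": None},
    {"drive_path": "2026-02/reaching", "name": "notes.txt",
     "local_path": "data/2026-02/reaching/notes.txt"},
]
first = prefetch_rows(rows, fake_download)
second = prefetch_rows(rows, fake_download)  # nothing left to fetch
forced = prefetch_rows(rows, fake_download, force=True)
print(first, second, forced)
```

The second call downloads nothing, which is the idempotent behavior described for prefetch; force=True bypasses the check.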

scan
scan(*, flat=True)

Scan Drive and rebuild the catalog JSON.

Any existing local_path entries are preserved so a rescan doesn't forget which recordings or items have already been downloaded.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| flat | bool | If True (the default), use a flat scan of all files visible to the service account. Set to False for a recursive traversal from the root folder. | True |

Returns:

| Type | Description |
|---|---|
| ScanResult | A ScanResult with new/existing/removed counts. |
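
Preserving local_path across a rescan, and deriving the new/existing/removed counts, can both be expressed with set operations on catalog keys. The sketch below assumes entries are keyed by (drive_path, base_name); merge_scan and the dict-based catalog are hypothetical and stand in for the library's JSON handling.

```python
def merge_scan(previous, scanned):
    """Merge a fresh scan into the previous catalog.

    Both arguments map (drive_path, base_name) to an entry dict.
    """
    merged = {}
    for key, entry in scanned.items():
        # Carry over local_path so downloaded entries stay marked as local.
        old_local = previous.get(key, {}).get("local_path")
        merged[key] = {**entry, "local_path": old_local}
    counts = {
        "new": len(scanned.keys() - previous.keys()),
        "existing": len(scanned.keys() & previous.keys()),
        "removed": len(previous.keys() - scanned.keys()),
    }
    return merged, counts


previous = {
    ("2026-02/reaching", "rat01"): {"local_path": "data/2026-02/reaching"},
    ("2026-02/reaching", "rat02"): {"local_path": None},
}
scanned = {
    ("2026-02/reaching", "rat01"): {"drive_file_ids": ["a", "b"]},
    ("2026-02/grasping", "rat03"): {"drive_file_ids": ["c"]},
}
merged, counts = merge_scan(previous, scanned)
print(counts)
```

In this example rat01 keeps its local_path, rat03 is counted as new, and rat02 is counted as removed.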

PrefetchResult dataclass

Summary of a Catalog.prefetch call.

Attributes:

| Name | Type | Description |
|---|---|---|
| recordings_downloaded | int | Number of recordings fetched from Drive. |
| recordings_skipped | int | Number of recordings already available locally. |
| items_downloaded | int | Number of items fetched from Drive. |
| items_skipped | int | Number of items already available locally. |

ScanResult dataclass

Summary of a Catalog.scan call.

Attributes:

| Name | Type | Description |
|---|---|---|
| recordings_new | int | Recordings found on Drive that were not in the previous catalog. |
| recordings_existing | int | Recordings found on Drive that were already in the catalog. |
| recordings_removed | int | Recordings in the previous catalog that were not found on Drive. |
| items_new | int | Items found on Drive that were not in the previous catalog. |
| items_existing | int | Items found on Drive that were already in the catalog. |
| items_removed | int | Items in the previous catalog that were not found on Drive. |
