Datasets

The datasets module provides a layered architecture for data handling:

  • Base Layer: WindowedDataset handles zarr I/O, windowing, and caching

  • Paradigm Layer: SupervisedDataset for supervised learning

  • Integration Layer: DataModule for Lightning integration

  • Storage Layer: DatasetCreator and Modality for creating datasets
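
The layering can be pictured with a small, library-independent sketch. The classes below (`ToyWindowedDataset`, `ToySupervisedDataset`) are hypothetical stand-ins built on plain Python lists. They are not the module's implementation and skip zarr I/O and caching entirely; they only illustrate how a paradigm layer might build on a base windowing layer.

```python
from typing import Dict, List, Sequence, Tuple


class ToyWindowedDataset:
    """Hypothetical base layer: cuts fixed-size windows out of named signals."""

    def __init__(self, signals: Dict[str, Sequence[float]], window_size: int, stride: int):
        lengths = {len(s) for s in signals.values()}
        if len(lengths) != 1:
            raise ValueError("all modalities must have the same number of samples")
        self.signals = signals
        self.window_size = window_size
        self.stride = stride
        n_samples = lengths.pop()
        # Number of full windows that fit into the recording.
        self.n_windows = max(0, (n_samples - window_size) // stride + 1)

    def __len__(self) -> int:
        return self.n_windows

    def window(self, index: int) -> Dict[str, List[float]]:
        if not 0 <= index < self.n_windows:
            raise IndexError(index)
        start = index * self.stride
        return {name: list(s[start:start + self.window_size]) for name, s in self.signals.items()}


class ToySupervisedDataset(ToyWindowedDataset):
    """Hypothetical paradigm layer: splits each window into inputs and targets."""

    def __init__(self, signals, window_size, stride, inputs: Sequence[str], targets: Sequence[str]):
        super().__init__(signals, window_size, stride)
        self.inputs = list(inputs)
        self.targets = list(targets)

    def __getitem__(self, index: int) -> Tuple[dict, dict]:
        win = self.window(index)
        return ({k: win[k] for k in self.inputs}, {k: win[k] for k in self.targets})


dataset = ToySupervisedDataset(
    {"emg": list(range(10)), "kinematics": [x * 0.5 for x in range(10)]},
    window_size=4, stride=2, inputs=["emg"], targets=["kinematics"],
)
x, y = dataset[1]  # second window covers samples 2..5
```

In this toy version the base class knows nothing about inputs or targets, so other paradigms could reuse the same windowing logic unchanged.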

Storage

DatasetCreator(modalities[, ...])

Creates datasets stored in zarr for direct tensor loading.

Modality([data, path, dims, attrs, transform])

Configuration for a data modality.

Base Dataset

WindowedDataset(zarr_path[, split, ...])

Base dataset that loads windows from zarr for any modality.

Paradigms

SupervisedDataset(zarr_path[, split, ...])

Dataset for supervised learning with inputs and targets.

Integration

DataModule(data_path[, inputs, targets, ...])

Lightning DataModule for supervised learning.

Utilities

DataSplitter([test_ratio, val_ratio])

Handles splitting data into training, testing, and validation sets.

DatasetFormatter([console, debug_level])

Handles Rich console output for dataset creation.
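
This page does not say how DataSplitter picks its split boundaries. As a rough illustration of what a `test_ratio` / `val_ratio` split can mean, the sketch below uses a hypothetical helper, `split_ranges`, and assumes that both ratios are fractions of the whole recording and that the split is contiguous and unshuffled. The real class may behave differently on both counts.

```python
from typing import Dict


def split_ranges(n_samples: int, test_ratio: float, val_ratio: float) -> Dict[str, range]:
    """Hypothetical helper: contiguous training/validation/testing index ranges.

    Assumes both ratios are fractions of the whole recording.
    """
    if test_ratio < 0.0 or val_ratio < 0.0 or test_ratio + val_ratio >= 1.0:
        raise ValueError("ratios must be non-negative and sum to less than 1")
    n_test = round(n_samples * test_ratio)
    n_val = round(n_samples * val_ratio)
    n_train = n_samples - n_test - n_val
    return {
        "training": range(0, n_train),
        "validation": range(n_train, n_train + n_val),
        "testing": range(n_train + n_val, n_samples),
    }


splits = split_ranges(1000, test_ratio=0.2, val_ratio=0.1)
```

Keeping each split contiguous avoids overlapping windows leaking between training and evaluation data, which is one reason time-series splits are often left unshuffled.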

Presets

Pre-configured transforms for published papers.

EMBCConfig([sampling_frequency, ...])

Configuration matching the EMBC 2022 paper.

embc_train_transform([config, augmentation])

Training-time transform for EMG (EMBC paper).

embc_eval_transform([config])

Evaluation-time transform for EMG (EMBC paper, no augmentation).

embc_target_transform()

Target transform: averages kinematics over the window.

embc_kinematics_transform()

Pre-storage transform for kinematics (EMBC paper).
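
The idea behind embc_target_transform, averaging kinematics over a window, can be sketched without the library. The function `average_over_window` below is hypothetical, and the (channels, time) layout is an assumption; the actual transform may operate on tensors with a different axis order.

```python
from statistics import fmean
from typing import List, Sequence


def average_over_window(window: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical target transform: one mean value per kinematic channel.

    Assumes `window` is laid out as (channels, time).
    """
    return [fmean(channel) for channel in window]


# Two kinematic channels, four time samples each.
target = average_over_window([[0.0, 1.0, 2.0, 3.0], [10.0, 10.0, 10.0, 10.0]])
```

Collapsing the time axis this way yields a single regression target per window instead of one per sample.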