SupervisedDataset
- class myoverse.datasets.paradigms.SupervisedDataset(zarr_path, split='training', inputs=('emg',), targets=('kinematics',), transform=None, target_transform=None, window_size=200, window_stride=None, n_windows=None, seed=None, device=None, dtype=torch.float32, cache_in_ram=True)
Dataset for supervised learning with inputs and targets.
Extends WindowedDataset to split modalities into inputs and targets, with separate transforms for each.
- Parameters:
zarr_path (Path | str) – Path to the Zarr dataset.
split (str) – Dataset split (‘training’, ‘validation’, ‘testing’).
inputs (Sequence[str]) – Modality names to use as model inputs.
targets (Sequence[str]) – Modality names to use as model targets.
transform (Callable | None) – Transform to apply to input data. Applied only when device is set.
target_transform (Callable | None) – Transform to apply to target data. Applied only when device is set.
window_size (int) – Number of samples per window.
window_stride (int | None) – Stride between windows. If None, uses random positions.
n_windows (int | None) – Number of windows per epoch. Required if window_stride is None.
seed (int | None) – Random seed for reproducible window positions.
device (torch.device | str | None) – Output device (‘cpu’ or ‘cuda’), or None to return NumPy arrays.
dtype (torch.dtype) – Data type for tensors.
cache_in_ram (bool) – Cache entire split in RAM.
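The interaction between window_size, window_stride, n_windows and seed can be sketched as follows. This is an illustration only, not the library's implementation: the helper name `window_starts` and its `n_samples` argument are hypothetical, and the sketch assumes strided windows start at sample 0 and that random start positions are drawn uniformly.

```python
import random


def window_starts(n_samples, window_size, window_stride=None, n_windows=None, seed=None):
    """Return the start index of every window in one epoch (hypothetical helper)."""
    last_start = n_samples - window_size
    if last_start < 0:
        raise ValueError("window_size exceeds the recording length")
    if window_stride is not None:
        # Deterministic sliding windows: 0, stride, 2 * stride, ...
        return list(range(0, last_start + 1, window_stride))
    if n_windows is None:
        raise ValueError("n_windows is required when window_stride is None")
    # Random positions; a fixed seed makes the draw reproducible.
    rng = random.Random(seed)
    return [rng.randint(0, last_start) for _ in range(n_windows)]


strided = window_starts(1000, window_size=200, window_stride=200)
sampled = window_starts(1000, window_size=200, n_windows=5, seed=0)
```

With a stride the windows are fixed by the recording length, whereas with window_stride=None the epoch length is set by n_windows. How the library treats n_windows when a stride is also given is not stated here, so the sketch ignores it in that case.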
Examples
>>> # Supervised learning: EMG → kinematics
>>> ds = SupervisedDataset(
...     "data.zip",
...     inputs=["emg"],
...     targets=["kinematics"],
...     window_size=200,
...     n_windows=10000,
...     device="cuda",
... )
>>> inputs, targets = ds[0]
>>> inputs["emg"].device  # cuda:0
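The way a loaded window might be divided into the `inputs` and `targets` dictionaries returned by `ds[0]` can be sketched in plain Python. The function `split_window` is hypothetical and is not part of myoverse. The sketch assumes each window is a mapping from modality name to data and that each transform is applied per modality. The real class may differ, and it applies transforms only when device is set.

```python
def split_window(window, inputs=("emg",), targets=("kinematics",),
                 transform=None, target_transform=None):
    """Split a {modality: data} window into (inputs, targets) dicts (hypothetical)."""
    x = {name: window[name] for name in inputs}
    y = {name: window[name] for name in targets}
    if transform is not None:
        # Assumption: the input transform is applied to each input modality.
        x = {name: transform(data) for name, data in x.items()}
    if target_transform is not None:
        # Assumption: the target transform is applied to each target modality.
        y = {name: target_transform(data) for name, data in y.items()}
    return x, y


# Toy window with plain lists standing in for arrays or tensors.
window = {"emg": [1.0, -2.0, 3.0], "kinematics": [0.5, 0.25]}
x, y = split_window(window, transform=lambda d: [abs(v) for v in d])
```

Keeping the two transforms separate means input-side processing (for example rectification or augmentation) never touches the targets.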
Methods
__getitem__(idx) – Load windows and split into inputs/targets.
__init__(zarr_path[, split, inputs, ...])