SupervisedDataset

class myoverse.datasets.paradigms.SupervisedDataset(zarr_path, split='training', inputs=('emg',), targets=('kinematics',), transform=None, target_transform=None, window_size=200, window_stride=None, n_windows=None, seed=None, device=None, dtype=torch.float32, cache_in_ram=True)

Dataset for supervised learning with inputs and targets.

Extends WindowedDataset to split modalities into inputs and targets, with separate transforms for each.
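
The input/target split described above can be illustrated with a minimal sketch. It is not the library's implementation: `split_window` is a hypothetical helper, the array shapes are made up, and applying each transform per modality is an assumption. The sketch also ignores the documented rule that transforms are applied only when device is set.

```python
import numpy as np


def split_window(window, inputs=("emg",), targets=("kinematics",),
                 transform=None, target_transform=None):
    """Hypothetical helper: split a {modality: array} window into (inputs, targets)."""
    # Select the modalities that belong to each side.
    x = {name: window[name] for name in inputs}
    y = {name: window[name] for name in targets}
    # Apply the separate transforms (assumed here to act on each array).
    if transform is not None:
        x = {name: transform(arr) for name, arr in x.items()}
    if target_transform is not None:
        y = {name: target_transform(arr) for name, arr in y.items()}
    return x, y


# Illustrative shapes only: (channels, samples) with a 200-sample window.
rng = np.random.default_rng(0)
window = {
    "emg": rng.standard_normal((32, 200)),
    "kinematics": rng.standard_normal((60, 200)),
}

x, y = split_window(
    window,
    transform=np.abs,                               # rectify the EMG window
    target_transform=lambda a: a.mean(axis=-1),     # average targets over the window
)
```

Here `x` holds only the `"emg"` entry and `y` only the `"kinematics"` entry, each processed by its own transform.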

Parameters:

zarr_path (Path | str) – Path to the Zarr dataset.
split (str) – Dataset split (‘training’, ‘validation’, ‘testing’).
inputs (Sequence[str]) – Modality names to use as model inputs.
targets (Sequence[str]) – Modality names to use as model targets.
transform (Callable | None) – Transform to apply to input data (only when device is set).
target_transform (Callable | None) – Transform to apply to target data (only when device is set).
window_size (int) – Number of samples per window.
window_stride (int | None) – Stride between consecutive windows, in samples. If None, window positions are sampled at random.
n_windows (int | None) – Number of windows per epoch. Required if window_stride is None.
seed (int | None) – Random seed for reproducible window positions.
device (torch.device | str | None) – Output device (‘cpu’ or ‘cuda’). If None, data is returned as NumPy arrays.
dtype (torch.dtype) – Data type for tensors.
cache_in_ram (bool) – Cache entire split in RAM.
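
How window_size, window_stride, n_windows, and seed interact can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `window_starts` is a hypothetical function, and the exact sampling scheme the dataset uses is not documented here.

```python
import numpy as np


def window_starts(n_samples, window_size=200, window_stride=None,
                  n_windows=None, seed=None):
    """Hypothetical helper: start indices of windows within a recording."""
    last_start = n_samples - window_size
    if last_start < 0:
        raise ValueError("recording is shorter than one window")
    if window_stride is not None:
        # Deterministic, evenly spaced windows.
        return np.arange(0, last_start + 1, window_stride)
    if n_windows is None:
        raise ValueError("n_windows is required when window_stride is None")
    # Random positions; a fixed seed makes them reproducible.
    rng = np.random.default_rng(seed)
    return rng.integers(0, last_start + 1, size=n_windows)


# Strided windows over a 1000-sample recording.
strided = window_starts(1000, window_size=200, window_stride=100)

# Random windows: the same seed yields the same positions.
random_a = window_starts(1000, window_size=200, n_windows=5, seed=42)
random_b = window_starts(1000, window_size=200, n_windows=5, seed=42)
```

With a stride of 100, the starts run from 0 to 800 in steps of 100. In the random case, `random_a` and `random_b` are identical because they share a seed.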

Examples

>>> # Supervised learning: EMG → kinematics
>>> ds = SupervisedDataset(
...     "data.zip",
...     inputs=["emg"],
...     targets=["kinematics"],
...     window_size=200,
...     n_windows=10000,
...     device="cuda",
... )
>>> inputs, targets = ds[0]
>>> inputs["emg"].device  # cuda:0

Methods

`__getitem__`(idx)	Load windows and split into inputs/targets.
`__init__`(zarr_path[, split, inputs, ...])
