EMGDataset#

class myoverse.datasets.supervised.EMGDataset(emg_data_path=PosixPath('REPLACE ME'), emg_data={}, ground_truth_data_path=PosixPath('REPLACE ME'), ground_truth_data={}, ground_truth_data_type='kinematics', sampling_frequency=0.0, tasks_to_use=(), save_path=PosixPath('REPLACE ME'), emg_filter_pipeline_before_chunking=(), emg_representations_to_filter_before_chunking=(), emg_filter_pipeline_after_chunking=(), emg_representations_to_filter_after_chunking=(), ground_truth_filter_pipeline_before_chunking=(), ground_truth_representations_to_filter_before_chunking=(), ground_truth_filter_pipeline_after_chunking=(), ground_truth_representations_to_filter_after_chunking=(), chunk_size=192, chunk_shift=64, testing_split_ratio=0.2, validation_split_ratio=0.2, augmentation_pipelines=(), amount_of_chunks_to_augment_at_once=250, debug_level=0, silence_zarr_warnings=True)[source]#

Class for creating a dataset from EMG and ground truth data.

Parameters:
  • emg_data_path (pathlib.Path) – Path to the EMG data file. It should be a pickle file containing a dictionary that maps task numbers to numpy arrays of shape (n_channels, n_samples).

  • emg_data (dict[str, np.ndarray]) – Optional dictionary containing EMG data, used instead of loading from a file.

  • ground_truth_data_path (pathlib.Path) – Path to the ground truth data file. It should be a pickle file containing a dictionary that maps task numbers to numpy arrays of shape (…, n_samples). The leading dimensions can be anything, but the last dimension must have the same number of samples as the EMG data.

  • ground_truth_data (dict[str, np.ndarray]) – Optional dictionary containing ground truth data, used instead of loading from a file.

  • ground_truth_data_type (str) – Type of ground truth data, e.g. ‘kinematics’

  • sampling_frequency (float) – Sampling frequency of the data in Hz

  • tasks_to_use (Sequence[str]) – Sequence of strings containing the task numbers to use. If empty, all tasks will be used.

  • save_path (pathlib.Path) – Path to save the dataset to. It should be a zarr file.

  • emg_filter_pipeline_before_chunking (list[list[FilterBaseClass]]) – Sequence of filters to apply to the EMG data before chunking.

  • emg_representations_to_filter_before_chunking (list[list[str]]) – Representations of EMG data to filter before chunking.

  • emg_filter_pipeline_after_chunking (list[list[FilterBaseClass]]) – Sequence of filters to apply to the EMG data after chunking.

  • emg_representations_to_filter_after_chunking (list[list[str]]) – Representations of EMG data to filter after chunking.

  • ground_truth_filter_pipeline_before_chunking (list[list[FilterBaseClass]]) – Sequence of filters to apply to the ground truth data before chunking.

  • ground_truth_representations_to_filter_before_chunking (list[list[str]]) – Representations of ground truth data to filter before chunking.

  • ground_truth_filter_pipeline_after_chunking (list[list[FilterBaseClass]]) – Sequence of filters to apply to the ground truth data after chunking.

  • ground_truth_representations_to_filter_after_chunking (list[list[str]]) – Representations of ground truth data to filter after chunking.

  • chunk_size (int) – Size of the chunks to create from the data.

  • chunk_shift (int) – Shift between the chunks.

  • testing_split_ratio (float) – Ratio of the data to reserve for testing. The data is split in the middle: the first part is used for training and the second part for testing. If 0, no data is used for testing.

  • validation_split_ratio (float) – Ratio of the data to reserve for validation. The data is split in the middle: the first part is used for training and the second part for validation. If 0, no data is used for validation.

  • augmentation_pipelines (list[list[EMGAugmentation]]) – Sequence of augmentation_pipelines to apply to the training data.

  • amount_of_chunks_to_augment_at_once (int) – Number of chunks to augment at once. Batching the augmentation speeds up the process.

  • debug_level (int) – Debug level:

      – 0: No debug output (default)

      – 1: Full text debugging with Rich (configuration, progress, tables, data details)

      – 2: Level 1 plus data visualizations (graphs and plots)

  • silence_zarr_warnings (bool) – Whether to silence all Zarr-related warnings, including those from the zarr.codecs and zarr.core modules.
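The chunking parameters determine how many overlapping windows are cut from each task's recording. The sketch below (plain NumPy, independent of myoverse, and an assumption about the chunking scheme rather than the library's exact implementation) shows how the default chunk_size=192 and chunk_shift=64 translate into chunk start indices:

```python
import numpy as np


def chunk_starts(n_samples: int, chunk_size: int = 192, chunk_shift: int = 64) -> np.ndarray:
    """Start indices of all full chunks that fit into a recording of n_samples."""
    # The last valid start index is n_samples - chunk_size; shorter tails are dropped.
    return np.arange(0, n_samples - chunk_size + 1, chunk_shift)


starts = chunk_starts(n_samples=1024)
# Each chunk covers [start, start + chunk_size); with shift 64 and size 192,
# consecutive chunks overlap by 128 samples.
print(len(starts), starts[:3])
```

With a shift smaller than the chunk size, neighbouring chunks overlap, which multiplies the number of training examples obtained from a fixed-length recording.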


Methods

__init__([emg_data_path, emg_data, ...])

_append_augmented_batch(training_group, ...) – Append a batch of augmented data to the training group.

_apply_augmentation_pipeline(aug_idx, ...[, ...]) – Apply a single augmentation pipeline to training data in batches.

_apply_augmentations(dataset, training_group) – Apply augmentations to the training data.

_print_dataset_summary(dataset) – Print a summary of the created dataset.

_process_task(task, emg_data, ...) – Process a single task and add its data to the dataset.

create_dataset() – Create a supervised dataset from EMG and ground truth data.

create_dataset()[source]#

Create a supervised dataset from EMG and ground truth data.
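The on-disk inputs described above can be produced with plain pickle. The sketch below writes EMG and kinematics dictionaries in the documented format; the file names, channel counts, and array sizes are illustrative only, and the commented-out constructor call is a hypothetical usage that requires myoverse to be installed:

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np

tmp = Path(tempfile.mkdtemp())

# Keys are task numbers (as strings); values are (n_channels, n_samples) arrays.
emg = {"1": np.random.randn(64, 2048), "2": np.random.randn(64, 2048)}

# Ground truth may have any leading shape, but the last dimension must match
# the number of EMG samples per task.
kinematics = {"1": np.random.randn(21, 3, 2048), "2": np.random.randn(21, 3, 2048)}

with open(tmp / "emg.pkl", "wb") as f:
    pickle.dump(emg, f)
with open(tmp / "kinematics.pkl", "wb") as f:
    pickle.dump(kinematics, f)

# Hypothetical usage (requires myoverse; parameter values are illustrative):
# from myoverse.datasets.supervised import EMGDataset
# dataset = EMGDataset(
#     emg_data_path=tmp / "emg.pkl",
#     ground_truth_data_path=tmp / "kinematics.pkl",
#     ground_truth_data_type="kinematics",
#     sampling_frequency=2048.0,
#     save_path=tmp / "dataset.zarr",
# )
# dataset.create_dataset()
```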