Datasets
Loader
- class myoverse.datasets.loader.EMGDatasetLoader(data_path, seed=None, dataloader_parameters=None, shuffle_training_data=True, input_type=numpy.float32, ground_truth_type=numpy.float32, ground_truth_name='ground_truth', input_augmentation_pipeline=[[IdentityFilter (IdentityFilter)]], input_augmentation_probabilities=(1,), ground_truth_augmentation_pipeline=[[IndexDataFilter (IndexDataFilter)]], ground_truth_augmentation_probabilities=(1,))
Dataset loader for the EMG dataset.
- Parameters:
data_path (Path) – The path to the zarr file
seed (Optional[int], optional) – The seed for the random number generator, by default None
dataloader_parameters (Dict[str, Any], optional) – The parameters for the DataLoader, by default None
shuffle_training_data (bool, optional) – Whether to shuffle the training data, by default True
input_type (numpy.dtype, optional) – The type of the input data, by default np.float32
ground_truth_type (numpy.dtype, optional) – The type of the ground_truth data, by default np.float32
ground_truth_name (str, optional) – The name of the ground truth data, by default “ground_truth”
input_augmentation_pipeline (list[list[FilterBaseClass]], optional) – The augmentation pipeline for the input data, by default [[IdentityFilter(is_output=True)]]
input_augmentation_probabilities (Sequence[float], optional) – The probabilities for the input augmentation pipeline, by default (1,). The probabilities must sum to 1, and there must be exactly one probability per augmentation sequence.
ground_truth_augmentation_pipeline (list[list[FilterBaseClass]], optional) – The augmentation pipeline for the ground truth data, by default a single sequence containing one IndexDataFilter, as shown in the class signature
ground_truth_augmentation_probabilities (Sequence[float], optional) – The probabilities for the ground truth augmentation pipeline, by default (1,). The probabilities must sum to 1, and there must be exactly one probability per augmentation sequence.
Initializes the dataset.
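The two constraints on the augmentation probabilities can be illustrated with a short standard-library sketch. It does not use MyoVerse itself: validate_probabilities and pick_pipeline are hypothetical helpers, and the pipelines are lists of plain strings standing in for FilterBaseClass instances. How the loader actually selects a pipeline is not stated here, so the weighted random draw is an assumption.

```python
import math
import random
from typing import Sequence


def validate_probabilities(pipelines: Sequence[Sequence[object]],
                           probabilities: Sequence[float]) -> None:
    """Check the two documented constraints (hypothetical helper)."""
    if len(pipelines) != len(probabilities):
        raise ValueError(
            f"got {len(probabilities)} probabilities for {len(pipelines)} pipelines"
        )
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError(f"probabilities sum to {sum(probabilities)}, expected 1")


def pick_pipeline(pipelines, probabilities, rng: random.Random):
    """Draw one pipeline, weighted by its probability (assumed behaviour)."""
    validate_probabilities(pipelines, probabilities)
    return rng.choices(pipelines, weights=probabilities, k=1)[0]


# Stand-ins for filter objects: each inner list is one augmentation sequence.
pipelines = [["identity"], ["noise", "scale"]]
probabilities = (0.75, 0.25)

rng = random.Random(0)  # seeded so the draw is reproducible
chosen = pick_pipeline(pipelines, probabilities, rng)
```

With the default value (1,), there is exactly one sequence and it is always selected.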
- data_path
The path to the zarr file
- Type:
Path
- dataloader_parameters
The parameters for the DataLoader, by default None
- Type:
Dict[str, Any], optional
- input_type
The type of the input data, by default np.float32
- Type:
np.dtype, optional
- ground_truth_type
The type of the ground truth data, by default np.float32
- Type:
np.dtype, optional
- ground_truth_name
The name of the ground truth data, by default “ground_truth”
- Type:
str, optional
- input_augmentation_pipeline
The augmentation pipeline for the input data, by default [[IdentityFilter(is_output=True)]]
- Type:
list[list[FilterBaseClass]], optional
- input_augmentation_probabilities
The probabilities for the input augmentation pipeline, by default (1,). The probabilities must sum to 1, and there must be exactly one probability per augmentation sequence.
- Type:
Sequence[float], optional
- ground_truth_augmentation_pipeline
The augmentation pipeline for the ground truth data, by default a single sequence containing one IndexDataFilter, as shown in the class signature
- Type:
list[list[FilterBaseClass]], optional
- ground_truth_augmentation_probabilities
The probabilities for the ground truth augmentation pipeline, by default (1,). The probabilities must sum to 1, and there must be exactly one probability per augmentation sequence.
- Type:
Sequence[float], optional
- test_dataloader()
Returns the testing set as a DataLoader.
- Returns:
The testing set
- Return type:
DataLoader
Supervised Dataset
- class myoverse.datasets.supervised.EMGDataset(emg_data_path=PosixPath('REPLACE ME'), emg_data={}, ground_truth_data_path=PosixPath('REPLACE ME'), ground_truth_data={}, ground_truth_data_type='kinematics', sampling_frequency=0.0, tasks_to_use=(), save_path=PosixPath('REPLACE ME'), emg_filter_pipeline_before_chunking=(), emg_representations_to_filter_before_chunking=(), emg_filter_pipeline_after_chunking=(), emg_representations_to_filter_after_chunking=(), ground_truth_filter_pipeline_before_chunking=(), ground_truth_representations_to_filter_before_chunking=(), ground_truth_filter_pipeline_after_chunking=(), ground_truth_representations_to_filter_after_chunking=(), chunk_size=192, chunk_shift=64, testing_split_ratio=0.2, validation_split_ratio=0.2, augmentation_pipelines=(), amount_of_chunks_to_augment_at_once=250, debug=False)
Class for creating a dataset from EMG and ground truth data.
- Parameters:
emg_data_path (pathlib.Path) – Path to the EMG data file. It should be a pickle file containing a dictionary with the keys being the task number and the values being a numpy array of shape (n_channels, n_samples).
ground_truth_data_path (pathlib.Path) – Path to the ground truth data file. It should be a pickle file containing a dictionary with the keys being the task number and the values being a numpy array of custom shape (…, n_samples). The custom shape can be anything, but the last dimension should be the same as the EMG data.
tasks_to_use (Sequence[str]) – Sequence of strings containing the task numbers to use. If empty, all tasks will be used.
save_path (pathlib.Path) – Path to save the dataset to. It should be a zarr file.
emg_filter_pipeline_before_chunking (list[list[FilterBaseClass]]) – Sequence of filters to apply to the EMG data before chunking. The filters should inherit from FilterBaseClass.
emg_filter_pipeline_after_chunking (list[list[FilterBaseClass]]) – Sequence of filters to apply to the EMG data after chunking. The filters should inherit from FilterBaseClass.
ground_truth_filter_pipeline_before_chunking (list[list[FilterBaseClass]]) – Sequence of filters to apply to the ground truth data before chunking. The filters should inherit from FilterBaseClass.
ground_truth_filter_pipeline_after_chunking (list[list[FilterBaseClass]]) – Sequence of filters to apply to the ground truth data after chunking. The filters should inherit from FilterBaseClass.
chunk_size (int) – Size of the chunks to create from the data.
chunk_shift (int) – Shift between the chunks.
testing_split_ratio (float) – Ratio of the data to use for testing. The data will be split in the middle. The first half will be used for training and the second half will be used for testing. If 0, no data will be used for testing.
validation_split_ratio (float) – Ratio of the data to use for validation. The data will be split in the middle. The first half will be used for training and the second half will be used for validation. If 0, no data will be used for validation.
augmentation_pipelines (list[list[EMGAugmentation]]) – Sequence of augmentation_pipelines to apply to the training data. The augmentation_pipelines should inherit from EMGAugmentation.
amount_of_chunks_to_augment_at_once (int) – Number of chunks to augment at once. Chunks are augmented in batches to speed up the process.
ground_truth_data_type (str)
sampling_frequency (float)
ground_truth_representations_to_filter_before_chunking (list[str])
ground_truth_representations_to_filter_after_chunking (list[str])
debug (bool)
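As an illustration of how chunk_size and chunk_shift interact, the sketch below cuts a recording into overlapping windows using only the standard library. It assumes a plain sliding window along the sample axis in which trailing samples that do not fill a whole chunk are dropped; the exact behaviour of EMGDataset may differ. chunk_starts and chunk_signal are hypothetical helpers, and nested lists stand in for numpy arrays.

```python
from typing import List


def chunk_starts(n_samples: int, chunk_size: int = 192, chunk_shift: int = 64) -> List[int]:
    """Start indices of all full windows (an incomplete tail is dropped)."""
    if chunk_size <= 0 or chunk_shift <= 0:
        raise ValueError("chunk_size and chunk_shift must be positive")
    return list(range(0, n_samples - chunk_size + 1, chunk_shift))


def chunk_signal(signal: List[List[float]], chunk_size: int = 192,
                 chunk_shift: int = 64) -> List[List[List[float]]]:
    """Split a (n_channels, n_samples) signal into (n_chunks, n_channels, chunk_size)."""
    n_samples = len(signal[0])
    return [
        [channel[start:start + chunk_size] for channel in signal]
        for start in chunk_starts(n_samples, chunk_size, chunk_shift)
    ]


# Toy recording: 2 channels, 512 samples each.
signal = [[float(i) for i in range(512)] for _ in range(2)]
chunks = chunk_signal(signal)

# With the defaults, consecutive chunks overlap by chunk_size - chunk_shift = 128 samples.
n_chunks = len(chunks)
```

Under these assumptions the number of chunks is (n_samples - chunk_size) // chunk_shift + 1 whenever n_samples is at least chunk_size, and zero otherwise.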
Default Supervised Datasets
- class myoverse.datasets.defaults.CastelliniDataset(emg_data_path, ground_truth_data_path, save_path, tasks_to_use=('Change Me',))
Dataset maker modeled on the Castellini paper [1]. This is not the official dataset maker used in the paper, but our own version based on it.
- Parameters:
emg_data_path (Path) – The path to the pickle file containing the EMG data. This should be a dictionary with the keys as the tasks in tasks_to_use and the values as the EMG data. The EMG data should be of shape (320, samples).
ground_truth_data_path (Path) – The path to the pickle file containing the ground truth data. This should be a dictionary with the keys as the tasks in tasks_to_use and the values as the ground truth data. The ground truth data should be of shape (21, 3, samples).
save_path (Path) – The path to save the dataset to. This should be a zarr file.
References
[1] Nowak, M., Vujaklija, I., Sturma, A., Castellini, C., Farina, D., 2023. Simultaneous and Proportional Real-Time Myocontrol of Up to Three Degrees of Freedom of the Wrist and Hand. IEEE Transactions on Biomedical Engineering 70, 459–469. https://doi.org/10/grc7qf
- class myoverse.datasets.defaults.EMBCDataset(emg_data_path, ground_truth_data_path, save_path, tasks_to_use=('Change Me',), debug=False)
Official dataset maker for the EMBC paper [1].
- Parameters:
emg_data_path (Path) – The path to the pickle file containing the EMG data. This should be a dictionary with the keys as the tasks in tasks_to_use and the values as the EMG data. The EMG data should be of shape (320, samples).
ground_truth_data_path (Path) – The path to the pickle file containing the ground truth data. This should be a dictionary with the keys as the tasks in tasks_to_use and the values as the ground truth data. The ground truth data should be of shape (21, 3, samples).
save_path (Path) – The path to save the dataset to. This should be a zarr file.
tasks_to_use (Sequence[str], optional) – The tasks to use. The default is EXPERIMENTS_TO_USE.
debug (bool)
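Before building one of the default datasets, it can help to check the shapes described above. The following standard-library sketch works on shape tuples (for example, the .shape of each array in the two loaded dictionaries). check_default_shapes is a hypothetical helper and not part of MyoVerse, and the task names and sample counts are made up for illustration.

```python
from typing import Dict, Sequence, Tuple

Shape = Tuple[int, ...]


def check_default_shapes(emg_shapes: Dict[str, Shape],
                         ground_truth_shapes: Dict[str, Shape],
                         tasks_to_use: Sequence[str]) -> None:
    """Raise ValueError if any task violates the documented shapes."""
    for task in tasks_to_use:
        if task not in emg_shapes or task not in ground_truth_shapes:
            raise ValueError(f"task {task!r} is missing from one of the dictionaries")
        emg, gt = emg_shapes[task], ground_truth_shapes[task]
        # EMG data: (320, samples)
        if len(emg) != 2 or emg[0] != 320:
            raise ValueError(f"task {task!r}: EMG shape {emg}, expected (320, samples)")
        # Ground truth data: (21, 3, samples)
        if len(gt) != 3 or gt[:2] != (21, 3):
            raise ValueError(f"task {task!r}: ground truth shape {gt}, expected (21, 3, samples)")
        # The last dimension must match between EMG and ground truth.
        if emg[-1] != gt[-1]:
            raise ValueError(f"task {task!r}: sample counts differ ({emg[-1]} vs {gt[-1]})")


# Shapes as they might be read from the two pickled dictionaries (made-up values).
emg_shapes = {"1": (320, 20480), "2": (320, 20480)}
ground_truth_shapes = {"1": (21, 3, 20480), "2": (21, 3, 20480)}
check_default_shapes(emg_shapes, ground_truth_shapes, tasks_to_use=("1", "2"))
```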
References
[1] Sîmpetru, R.C., Osswald, M., Braun, D.I., Souza de Oliveira, D., Cakici, A.L., Del Vecchio, A., 2022. Accurate Continuous Prediction of 14 Degrees of Freedom of the Hand from Myoelectrical Signals through Convolutive Deep Learning, in: Proceedings of the 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC) pp. 702–706. https://doi.org/10/gq2f47