Creating a dataset
This example shows how to create a dataset for training a deep learning model.
It builds the dataset used in our real-time paper [1].
from functools import partial
from pathlib import Path

import numpy as np
from scipy.signal import butter

from myoverse.datasets.filters.emg_augmentations import WaveletDecomposition
from myoverse.datasets.filters.generic import (
    ApplyFunctionFilter,
    IndexDataFilter,
    IdentityFilter,
)
from myoverse.datasets.filters.temporal import SOSFrequencyFilter, RMSFilter
from myoverse.datasets.supervised import EMGDataset
# Example 1: Creating a dataset with specific filter pipelines using Zarr 3
dataset = EMGDataset(
    emg_data_path=Path(r"../data/emg.pkl").resolve(),
    ground_truth_data_path=Path(r"../data/kinematics.pkl").resolve(),
    ground_truth_data_type="kinematics",
    sampling_frequency=2044.0,
    tasks_to_use=["1", "2"],
    save_path=Path(r"../data/dataset.zarr").resolve(),
    # EMG: remove 50 Hz powerline noise, then low-pass the result at 20 Hz.
    # Both filters are marked as outputs, so both representations are stored.
    emg_filter_pipeline_after_chunking=[
        [
            SOSFrequencyFilter(
                sos_filter_coefficients=butter(
                    4, [47.5, 52.5], "bandstop", output="sos", fs=2044
                ),
                is_output=True,
                name="Raw No Powerline (Bandstop 50 Hz)",
                input_is_chunked=True,
            ),
            SOSFrequencyFilter(
                sos_filter_coefficients=butter(4, 20, "lowpass", output="sos", fs=2044),
                is_output=True,
                name="Raw No High Freq (Lowpass 20 Hz)",
                input_is_chunked=True,
            ),
        ]
    ],
    emg_representations_to_filter_after_chunking=[["Last"]],
    # Ground truth: flatten (21, 3, samples) to (63, samples), then keep rows 3..62.
    ground_truth_filter_pipeline_before_chunking=[
        [
            ApplyFunctionFilter(
                function=np.reshape, newshape=(63, -1), input_is_chunked=False
            ),
            IndexDataFilter(indices=(slice(3, 63),), input_is_chunked=False),
        ]
    ],
    ground_truth_representations_to_filter_before_chunking=[["Input"]],
    # Ground truth: average each chunk over time to get one target per EMG chunk.
    ground_truth_filter_pipeline_after_chunking=[
        [
            ApplyFunctionFilter(
                function=partial(np.mean, axis=-1),
                is_output=True,
                name="Mean Kinematics per EMG Chunk",
                input_is_chunked=True,
            ),
        ]
    ],
    ground_truth_representations_to_filter_after_chunking=[["Last"]],
    chunk_size=192,
    chunk_shift=64,
    testing_split_ratio=0.3,
    validation_split_ratio=0.1,
    augmentation_pipelines=[
        [
            WaveletDecomposition(
                nr_of_grids=5, is_output=True, level=2, input_is_chunked=False
            )
        ]
    ],
    debug_level=1,  # Print progress and shape summaries while building
    silence_zarr_warnings=True,  # Silence zarr codec warnings
)

# Create the dataset
dataset.create_dataset()
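The two EMG filters in the pipeline are ordinary SciPy second-order-section designs. As a rough sketch of their effect on a chunked array, the snippet below designs the same coefficients and applies them along the time axis. Whether `SOSFrequencyFilter` filters with zero-phase `sosfiltfilt` or causal `sosfilt` is not shown in this example, so the filtering call is an assumption, and the input is random stand-in data with fewer channels than the real recording.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 2044  # sampling frequency from the example above (Hz)

# Same filter designs as in the EMG pipeline.
sos_bandstop = butter(4, [47.5, 52.5], "bandstop", output="sos", fs=FS)
sos_lowpass = butter(4, 20, "lowpass", output="sos", fs=FS)

# Stand-in for chunked EMG with shape (chunks, channels, samples).
rng = np.random.default_rng(0)
chunks = rng.standard_normal((4, 8, 192))

# Assumption: zero-phase filtering along the last (time) axis of every chunk.
no_powerline = sosfiltfilt(sos_bandstop, chunks, axis=-1)
# The second filter runs on the "Last" representation, i.e. on the bandstop output.
low_freq = sosfiltfilt(sos_lowpass, no_powerline, axis=-1)

print(no_powerline.shape, low_freq.shape)
```

Filtering keeps the chunk shape, which matches the `(317, 320, 192)` shapes reported for both output representations in the log.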
────────────────────────── STARTING DATASET CREATION ───────────────────────────
Dataset Configuration
╭──────────────────────────────────┬───────────────────────────────────────────╮
│ Parameter │ Value │
├──────────────────────────────────┼───────────────────────────────────────────┤
│ EMG data path │ /home/runner/work/MyoVerse/MyoVerse/ex… │
│ Ground truth data path │ /home/runner/work/MyoVerse/MyoVerse/ex… │
│ Ground truth data type │ kinematics │
│ Sampling frequency (Hz) │ 2044.0 │
│ Save path │ /home/runner/work/MyoVerse/MyoVerse/ex… │
│ Chunk size │ 192 │
│ Chunk shift │ 64 │
│ Testing split ratio │ 0.3 │
│ Validation split ratio │ 0.1 │
│ Amount of chunks to augment at once │ 250 │
│ Debug level │ 1 │
│ Silence Zarr warnings │ True │
╰──────────────────────────────────┴───────────────────────────────────────────╯
Processing 2 tasks: 1, 2
Dataset Structure
├── EMG Data
│ ├── Task 1: Shape (320, 20440)
│ └── Task 2: Shape (320, 20440)
└── Ground Truth Data
├── Task 1: Shape (21, 3, 20440)
└── Task 2: Shape (21, 3, 20440)
─────────────────────────────── PROCESSING TASKS ───────────────────────────────
Initial Data
╭──────── EMG Data Task 1 ────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (320, 20440) │
│ -- │
╰─────────────────────────────────╯
╭─── Ground Truth Data Task 1 ────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (21, 3, 20440) │
│ -- │
╰─────────────────────────────────╯
Pre-Chunking Processing
▶ Applying ground truth filters before chunking...
Chunking Process
▶ Chunking EMG data...
▶ Chunking ground truth data...
──────────────────────────────── After Chunking ────────────────────────────────
╭─────────── Chunked EMG Data ────────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (320, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) EMG_Chunkizer (317, 320, 192) │
│ -- │
╰─────────────────────────────────────────╯
╭────────────── Chunked Ground Truth Data ──────────────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (21, 3, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) ApplyFunctionFilter (63, 20440) │
│ (2 | 1 -> 2) IndexDataFilter (60, 20440) │
│ (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192) │
│ -- │
╰───────────────────────────────────────────────────────╯
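The chunk count in the panels above follows from sliding-window arithmetic: with 20440 samples, `chunk_size=192`, and `chunk_shift=64`, there are `(20440 - 192) // 64 + 1 = 317` windows. The sketch below reproduces the logged shape with NumPy, assuming a plain sliding window without padding; the helper `chunk` is hypothetical and is not the library's chunkizer.

```python
import numpy as np


def chunk(signal, chunk_size, chunk_shift):
    """Split (channels, samples) into (chunks, channels, chunk_size) windows."""
    # All windows of length chunk_size: (channels, samples - chunk_size + 1, chunk_size)
    windows = np.lib.stride_tricks.sliding_window_view(signal, chunk_size, axis=-1)
    # Keep every chunk_shift-th window and move the chunk axis to the front.
    return np.moveaxis(windows[:, ::chunk_shift], 1, 0)


emg = np.zeros((320, 20440), dtype=np.float32)  # stand-in for one task
chunked = chunk(emg, chunk_size=192, chunk_shift=64)
print(chunked.shape)  # (317, 320, 192)
```

Because 192 is three times 64, consecutive chunks overlap by two thirds of their length.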
Post-Chunking Processing
▶ Applying EMG filters after chunking...
▶ Applying ground truth filters after chunking...
───────────────────────── After Filtering Chunked Data ─────────────────────────
╭───────────────────────── Filtered Chunked EMG Data ──────────────────────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (320, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) EMG_Chunkizer (317, 320, 192) │
│ (2 | 1 -> 2) (Output) Raw No Powerline (Bandstop 50 Hz) (317, 320, 192) │
│ (3 | 1 -> 2 -> 3) (Output) Raw No High Freq (Lowpass 20 Hz) (317, 320, 192) │
│ -- │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─────────────────── Filtered Chunked Ground Truth Data ────────────────────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (21, 3, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) ApplyFunctionFilter (63, 20440) │
│ (2 | 1 -> 2) IndexDataFilter (60, 20440) │
│ (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192) │
│ (4 | 1 -> 2 -> 3 -> 4) (Output) Mean Kinematics per EMG Chunk (317, 60) │
│ -- │
╰───────────────────────────────────────────────────────────────────────────╯
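The ground truth shapes in this panel can be traced step by step with plain NumPy. The sketch below uses zero-filled stand-in data and assumes the same unpadded sliding window as the EMG chunking; it passes the target shape to `np.reshape` positionally, which works across NumPy versions.

```python
from functools import partial

import numpy as np

kinematics = np.zeros((21, 3, 20440), dtype=np.float32)  # stand-in data

# Before chunking: flatten the first two axes, then drop the first three rows.
flat = np.reshape(kinematics, (63, -1))  # (63, 20440)
selected = flat[(slice(3, 63),)]  # (60, 20440)

# Chunking (assumed: sliding window, chunk_size=192, chunk_shift=64).
windows = np.lib.stride_tricks.sliding_window_view(selected, 192, axis=-1)
chunked = np.moveaxis(windows[:, ::64], 1, 0)  # (317, 60, 192)

# After chunking: average over time, giving one 60-value target per EMG chunk.
target = partial(np.mean, axis=-1)(chunked)  # (317, 60)

print(flat.shape, selected.shape, chunked.shape, target.shape)
```

Each EMG chunk of 192 samples is thus paired with a single 60-dimensional kinematics target.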
Dataset Creation
▶ Adding processed data to dataset...
Adding data with keys: ['Raw No Powerline (Bandstop 50 Hz)', 'Raw No High Freq (Lowpass 20 Hz)']
Splitting representation: Raw No Powerline (Bandstop 50 Hz) with shape (317, 320, 192)
Training shape: (223, 320, 192)
Testing shape: (94, 320, 192)
After validation split - Testing shape: (86, 320, 192)
Validation shape: (8, 320, 192)
Splitting representation: Raw No High Freq (Lowpass 20 Hz) with shape (317, 320, 192)
Training shape: (223, 320, 192)
Testing shape: (94, 320, 192)
After validation split - Testing shape: (86, 320, 192)
Validation shape: (8, 320, 192)
Dataset Split Sizes
╭───────────────────┬──────────────────╮
│ Split │ Sizes │
├───────────────────┼──────────────────┤
│ Training │ [223, 223] │
│ Testing │ [86, 86] │
│ Validation │ [8, 8] │
╰───────────────────┴──────────────────╯
Adding data with keys: ['Mean Kinematics per EMG Chunk']
Splitting representation: Mean Kinematics per EMG Chunk with shape (317, 60)
Training shape: (223, 60)
Testing shape: (94, 60)
After validation split - Testing shape: (86, 60)
Validation shape: (8, 60)
Dataset Split Sizes
╭───────────────────────┬──────────────╮
│ Split │ Sizes │
├───────────────────────┼──────────────┤
│ Training │ [223] │
│ Testing │ [86] │
│ Validation │ [8] │
╰───────────────────────┴──────────────╯
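The logged split sizes are not simply `floor(n * ratio)`: 30% of 317 chunks is 95.1, yet 94 testing chunks are reported. One rule that reproduces every size logged on this page is to round each held-out count down to an even number, as would happen if the held-out chunks were taken symmetrically around a midpoint. This is inferred from the numbers only; the functions below are hypothetical and the library's actual splitting logic is not shown in this example.

```python
def held_out(n, ratio):
    """Hypothetical rule: round the held-out count down to an even number."""
    return 2 * int(n * ratio / 2)


def split_sizes(n_chunks, testing_ratio, validation_ratio):
    """Return (training, testing, validation) sizes for one task."""
    testing = held_out(n_chunks, testing_ratio)
    training = n_chunks - testing
    # The validation set is carved out of the testing set.
    validation = held_out(testing, validation_ratio)
    return training, testing - validation, validation


print(split_sizes(317, 0.3, 0.1))  # (223, 86, 8)
print(split_sizes(317, 0.2, 0.2))  # (255, 50, 12)
```

The first call matches this run; the second matches the `EMBCDataset` run further down, which logs ratios of 0.2 and 0.2.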
Initial Data
╭──────── EMG Data Task 2 ────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (320, 20440) │
│ -- │
╰─────────────────────────────────╯
╭─── Ground Truth Data Task 2 ────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (21, 3, 20440) │
│ -- │
╰─────────────────────────────────╯
Pre-Chunking Processing
▶ Applying ground truth filters before chunking...
Chunking Process
▶ Chunking EMG data...
▶ Chunking ground truth data...
──────────────────────────────── After Chunking ────────────────────────────────
╭─────────── Chunked EMG Data ────────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (320, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) EMG_Chunkizer (317, 320, 192) │
│ -- │
╰─────────────────────────────────────────╯
╭────────────── Chunked Ground Truth Data ──────────────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (21, 3, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) ApplyFunctionFilter (63, 20440) │
│ (2 | 1 -> 2) IndexDataFilter (60, 20440) │
│ (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192) │
│ -- │
╰───────────────────────────────────────────────────────╯
Post-Chunking Processing
▶ Applying EMG filters after chunking...
▶ Applying ground truth filters after chunking...
───────────────────────── After Filtering Chunked Data ─────────────────────────
╭───────────────────────── Filtered Chunked EMG Data ──────────────────────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (320, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) EMG_Chunkizer (317, 320, 192) │
│ (2 | 1 -> 2) (Output) Raw No Powerline (Bandstop 50 Hz) (317, 320, 192) │
│ (3 | 1 -> 2 -> 3) (Output) Raw No High Freq (Lowpass 20 Hz) (317, 320, 192) │
│ -- │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─────────────────── Filtered Chunked Ground Truth Data ────────────────────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2044.0 Hz │
│ (0) Input (21, 3, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) ApplyFunctionFilter (63, 20440) │
│ (2 | 1 -> 2) IndexDataFilter (60, 20440) │
│ (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192) │
│ (4 | 1 -> 2 -> 3 -> 4) (Output) Mean Kinematics per EMG Chunk (317, 60) │
│ -- │
╰───────────────────────────────────────────────────────────────────────────╯
Dataset Creation
▶ Adding processed data to dataset...
Adding data with keys: ['Raw No Powerline (Bandstop 50 Hz)', 'Raw No High Freq (Lowpass 20 Hz)']
Splitting representation: Raw No Powerline (Bandstop 50 Hz) with shape (317, 320, 192)
Training shape: (223, 320, 192)
Testing shape: (94, 320, 192)
After validation split - Testing shape: (86, 320, 192)
Validation shape: (8, 320, 192)
Splitting representation: Raw No High Freq (Lowpass 20 Hz) with shape (317, 320, 192)
Training shape: (223, 320, 192)
Testing shape: (94, 320, 192)
After validation split - Testing shape: (86, 320, 192)
Validation shape: (8, 320, 192)
Dataset Split Sizes
╭───────────────────┬──────────────────╮
│ Split │ Sizes │
├───────────────────┼──────────────────┤
│ Training │ [223, 223] │
│ Testing │ [86, 86] │
│ Validation │ [8, 8] │
╰───────────────────┴──────────────────╯
Adding data with keys: ['Mean Kinematics per EMG Chunk']
Splitting representation: Mean Kinematics per EMG Chunk with shape (317, 60)
Training shape: (223, 60)
Testing shape: (94, 60)
After validation split - Testing shape: (86, 60)
Validation shape: (8, 60)
Dataset Split Sizes
╭───────────────────────┬──────────────╮
│ Split │ Sizes │
├───────────────────────┼──────────────┤
│ Training │ [223] │
│ Testing │ [86] │
│ Validation │ [8] │
╰───────────────────────┴──────────────╯
Processing task 2 (2/2) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:00
──────────────────────────── APPLYING AUGMENTATIONS ────────────────────────────
Augmentation Configuration
╭──────────────────────────────────┬────────────────────────╮
│ Parameter │ Value │
├──────────────────────────────────┼────────────────────────┤
│ Total augmentation pipelines │ 1 │
│ Pipelines │ WaveletDecomposition │
│ Chunks to augment at once │ 250 │
│ Total training samples │ 446 │
╰──────────────────────────────────┴────────────────────────╯
Pipeline 1/1: Batch 2/2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:00
────────────────────────── DATASET CREATION COMPLETED ──────────────────────────
Dataset Summary
╭─────────────────────────────────────────┬────────────────╮
│ Metric │ Value │
├─────────────────────────────────────────┼────────────────┤
│ Total tasks │ 2 │
│ Training samples │ 892 │
│ Testing samples │ 172 │
│ Validation samples │ 16 │
│ Total dataset size │ 1012.75 MB │
│ Training split size │ 836.45 MB │
│ Testing split size │ 161.29 MB │
│ Validation split size │ 15.00 MB │
╰─────────────────────────────────────────┴────────────────╯
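The sample counts in this summary can be related to the per-task split sizes. Two tasks with 223 training chunks each give the 446 training samples reported before augmentation, and the final count of 892 is consistent with the single augmentation pipeline appending one augmented copy of the training set. The `EMBCDataset` run further down (510 samples, three pipelines, 4080 final samples, five batches of 500 in the last pipeline) fits the same pattern if each pipeline is applied to everything accumulated so far. The sketch below checks that arithmetic; the doubling rule is an inference from the logs, not documented behavior.

```python
import math


def after_augmentation(n_training, n_pipelines):
    """Assumed rule: every pipeline appends an augmented copy of the current set."""
    return n_training * 2**n_pipelines


def batches_in_last_pipeline(n_training, n_pipelines, chunks_at_once):
    """Number of batches the last pipeline needs under the same assumption."""
    current = n_training * 2 ** (n_pipelines - 1)
    return math.ceil(current / chunks_at_once)


print(after_augmentation(2 * 223, 1))  # 892
print(batches_in_last_pipeline(2 * 223, 1, 250))  # 2
print(after_augmentation(2 * 255, 3))  # 4080
print(batches_in_last_pipeline(2 * 255, 3, 500))  # 5
```

Testing and validation sets are not augmented: 172 and 16 are simply twice the per-task sizes of 86 and 8.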
Dataset Structure
├── Training
│ └── EMG Representations
│ ├── Raw No High Freq (Lowpass 20 Hz): (892, 320, 192)
│ └── Raw No Powerline (Bandstop 50 Hz): (892, 320, 192)
├── Testing
│ └── EMG Representations
│ ├── Raw No High Freq (Lowpass 20 Hz): (172, 320, 192)
│ └── Raw No Powerline (Bandstop 50 Hz): (172, 320, 192)
└── Validation
└── EMG Representations
├── Raw No High Freq (Lowpass 20 Hz): (16, 320, 192)
└── Raw No Powerline (Bandstop 50 Hz): (16, 320, 192)
─────────────────── Dataset Creation Successfully Completed! ───────────────────
Default datasets are also available. The following example uses the EMBCDataset from [2].
[2] Sîmpetru, R.C., Osswald, M., Braun, D.I., Souza de Oliveira, D., Cakici, A.L., Del Vecchio, A., 2022. Accurate continuous prediction of 14 degrees of freedom of the hand from myoelectrical signals through convolutive deep learning, in: Proceedings of the 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pp. 702–706. https://doi.org/10/gq2f47
from myoverse.datasets.defaults import EMBCDataset

# Using the default EMBCDataset with Zarr 3
dataset = EMBCDataset(
    emg_data_path=Path(r"../data/emg.pkl").resolve(),
    ground_truth_data_path=Path(r"../data/kinematics.pkl").resolve(),
    save_path=Path(r"../data/dataset_embc.zarr").resolve(),
    tasks_to_use=["1", "2"],
    debug_level=1,
    silence_zarr_warnings=True,  # Silence zarr codec warnings
)

# Create the EMBC dataset
dataset.create_dataset()
────────────────────────── STARTING DATASET CREATION ───────────────────────────
Dataset Configuration
╭──────────────────────────────────┬───────────────────────────────────────────╮
│ Parameter │ Value │
├──────────────────────────────────┼───────────────────────────────────────────┤
│ EMG data path │ /home/runner/work/MyoVerse/MyoVerse/ex… │
│ Ground truth data path │ /home/runner/work/MyoVerse/MyoVerse/ex… │
│ Ground truth data type │ kinematics │
│ Sampling frequency (Hz) │ 2048.0 │
│ Save path │ /home/runner/work/MyoVerse/MyoVerse/ex… │
│ Chunk size │ 192 │
│ Chunk shift │ 64 │
│ Testing split ratio │ 0.2 │
│ Validation split ratio │ 0.2 │
│ Amount of chunks to augment at once │ 500 │
│ Debug level │ 1 │
│ Silence Zarr warnings │ True │
╰──────────────────────────────────┴───────────────────────────────────────────╯
Processing 2 tasks: 1, 2
Dataset Structure
├── EMG Data
│ ├── Task 1: Shape (320, 20440)
│ └── Task 2: Shape (320, 20440)
└── Ground Truth Data
├── Task 1: Shape (21, 3, 20440)
└── Task 2: Shape (21, 3, 20440)
─────────────────────────────── PROCESSING TASKS ───────────────────────────────
Initial Data
╭──────── EMG Data Task 1 ────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (320, 20440) │
│ -- │
╰─────────────────────────────────╯
╭─── Ground Truth Data Task 1 ────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (21, 3, 20440) │
│ -- │
╰─────────────────────────────────╯
Pre-Chunking Processing
▶ Applying ground truth filters before chunking...
Chunking Process
▶ Chunking EMG data...
▶ Chunking ground truth data...
──────────────────────────────── After Chunking ────────────────────────────────
╭─────────── Chunked EMG Data ────────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (320, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) EMG_Chunkizer (317, 320, 192) │
│ -- │
╰─────────────────────────────────────────╯
╭────────────── Chunked Ground Truth Data ──────────────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (21, 3, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) Reshape (63, 20440) │
│ (2 | 1 -> 2) IndexDataFilter (60, 20440) │
│ (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192) │
│ -- │
╰───────────────────────────────────────────────────────╯
Post-Chunking Processing
▶ Applying EMG filters after chunking...
▶ Applying ground truth filters after chunking...
───────────────────────── After Filtering Chunked Data ─────────────────────────
╭─────────────────── Filtered Chunked EMG Data ───────────────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (320, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) EMG_Chunkizer (317, 320, 192) │
│ (2 | 1 -> 2) (Output) raw (317, 320, 192) │
│ (3 | 1 -> 2 -> 3) (Output) SOSFrequencyFilter (317, 320, 192) │
│ -- │
╰─────────────────────────────────────────────────────────────────╯
╭───────── Filtered Chunked Ground Truth Data ──────────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (21, 3, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) Reshape (63, 20440) │
│ (2 | 1 -> 2) IndexDataFilter (60, 20440) │
│ (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192) │
│ (4 | 1 -> 2 -> 3 -> 4) (Output) Mean (317, 60) │
│ -- │
╰───────────────────────────────────────────────────────╯
Dataset Creation
▶ Adding processed data to dataset...
Adding data with keys: ['raw', 'SOSFrequencyFilter']
Splitting representation: raw with shape (317, 320, 192)
Training shape: (255, 320, 192)
Testing shape: (62, 320, 192)
After validation split - Testing shape: (50, 320, 192)
Validation shape: (12, 320, 192)
Splitting representation: SOSFrequencyFilter with shape (317, 320, 192)
Training shape: (255, 320, 192)
Testing shape: (62, 320, 192)
After validation split - Testing shape: (50, 320, 192)
Validation shape: (12, 320, 192)
Dataset Split Sizes
╭───────────────────┬──────────────────╮
│ Split │ Sizes │
├───────────────────┼──────────────────┤
│ Training │ [255, 255] │
│ Testing │ [50, 50] │
│ Validation │ [12, 12] │
╰───────────────────┴──────────────────╯
Adding data with keys: ['Mean']
Splitting representation: Mean with shape (317, 60)
Training shape: (255, 60)
Testing shape: (62, 60)
After validation split - Testing shape: (50, 60)
Validation shape: (12, 60)
Dataset Split Sizes
╭───────────────────────┬──────────────╮
│ Split │ Sizes │
├───────────────────────┼──────────────┤
│ Training │ [255] │
│ Testing │ [50] │
│ Validation │ [12] │
╰───────────────────────┴──────────────╯
Initial Data
╭──────── EMG Data Task 2 ────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (320, 20440) │
│ -- │
╰─────────────────────────────────╯
╭─── Ground Truth Data Task 2 ────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (21, 3, 20440) │
│ -- │
╰─────────────────────────────────╯
Pre-Chunking Processing
▶ Applying ground truth filters before chunking...
Chunking Process
▶ Chunking EMG data...
▶ Chunking ground truth data...
──────────────────────────────── After Chunking ────────────────────────────────
╭─────────── Chunked EMG Data ────────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (320, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) EMG_Chunkizer (317, 320, 192) │
│ -- │
╰─────────────────────────────────────────╯
╭────────────── Chunked Ground Truth Data ──────────────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (21, 3, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) Reshape (63, 20440) │
│ (2 | 1 -> 2) IndexDataFilter (60, 20440) │
│ (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192) │
│ -- │
╰───────────────────────────────────────────────────────╯
Post-Chunking Processing
▶ Applying EMG filters after chunking...
▶ Applying ground truth filters after chunking...
───────────────────────── After Filtering Chunked Data ─────────────────────────
╭─────────────────── Filtered Chunked EMG Data ───────────────────╮
│ -- │
│ EMGData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (320, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) EMG_Chunkizer (317, 320, 192) │
│ (2 | 1 -> 2) (Output) raw (317, 320, 192) │
│ (3 | 1 -> 2 -> 3) (Output) SOSFrequencyFilter (317, 320, 192) │
│ -- │
╰─────────────────────────────────────────────────────────────────╯
╭───────── Filtered Chunked Ground Truth Data ──────────╮
│ -- │
│ KinematicsData │
│ Sampling frequency: 2048.0 Hz │
│ (0) Input (21, 3, 20440) │
│ │
│ Filter(s): │
│ (1 | 1) Reshape (63, 20440) │
│ (2 | 1 -> 2) IndexDataFilter (60, 20440) │
│ (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192) │
│ (4 | 1 -> 2 -> 3 -> 4) (Output) Mean (317, 60) │
│ -- │
╰───────────────────────────────────────────────────────╯
Dataset Creation
▶ Adding processed data to dataset...
Adding data with keys: ['raw', 'SOSFrequencyFilter']
Splitting representation: raw with shape (317, 320, 192)
Training shape: (255, 320, 192)
Testing shape: (62, 320, 192)
After validation split - Testing shape: (50, 320, 192)
Validation shape: (12, 320, 192)
Splitting representation: SOSFrequencyFilter with shape (317, 320, 192)
Training shape: (255, 320, 192)
Testing shape: (62, 320, 192)
After validation split - Testing shape: (50, 320, 192)
Validation shape: (12, 320, 192)
Dataset Split Sizes
╭───────────────────┬──────────────────╮
│ Split │ Sizes │
├───────────────────┼──────────────────┤
│ Training │ [255, 255] │
│ Testing │ [50, 50] │
│ Validation │ [12, 12] │
╰───────────────────┴──────────────────╯
Adding data with keys: ['Mean']
Splitting representation: Mean with shape (317, 60)
Training shape: (255, 60)
Testing shape: (62, 60)
After validation split - Testing shape: (50, 60)
Validation shape: (12, 60)
Dataset Split Sizes
╭───────────────────────┬──────────────╮
│ Split │ Sizes │
├───────────────────────┼──────────────┤
│ Training │ [255] │
│ Testing │ [50] │
│ Validation │ [12] │
╰───────────────────────┴──────────────╯
Processing task 2 (2/2) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:00
──────────────────────────── APPLYING AUGMENTATIONS ────────────────────────────
Augmentation Configuration
╭──────────────────────────────────┬────────────────────────╮
│ Parameter │ Value │
├──────────────────────────────────┼────────────────────────┤
│ Total augmentation pipelines │ 3 │
│ Pipelines │ GaussianNoise │
│ │ MagnitudeWarping │
│ │ WaveletDecomposition │
│ Chunks to augment at once │ 500 │
│ Total training samples │ 510 │
╰──────────────────────────────────┴────────────────────────╯
Pipeline 3/3: Batch 5/5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:00
────────────────────────── DATASET CREATION COMPLETED ──────────────────────────
Dataset Summary
╭─────────────────────────────────────────┬────────────────╮
│ Metric │ Value │
├─────────────────────────────────────────┼────────────────┤
│ Total tasks │ 2 │
│ Training samples │ 4080 │
│ Testing samples │ 100 │
│ Validation samples │ 24 │
│ Total dataset size │ 2464.24 MB │
│ Training split size │ 2391.56 MB │
│ Testing split size │ 58.62 MB │
│ Validation split size │ 14.07 MB │
╰─────────────────────────────────────────┴────────────────╯
Dataset Structure
├── Training
│ └── EMG Representations
│ ├── SOSFrequencyFilter: (4080, 320, 192)
│ └── raw: (4080, 320, 192)
├── Testing
│ └── EMG Representations
│ ├── SOSFrequencyFilter: (100, 320, 192)
│ └── raw: (100, 320, 192)
└── Validation
└── EMG Representations
├── SOSFrequencyFilter: (24, 320, 192)
└── raw: (24, 320, 192)
─────────────────── Dataset Creation Successfully Completed! ───────────────────
Total running time of the script: (0 minutes 46.263 seconds)
Estimated memory usage: 1947 MB