Creating a dataset

This example shows how to create a dataset for training a deep learning model.

In this example, we create the dataset that was used in our real-time paper [1].

from functools import partial
from pathlib import Path

import numpy as np
from scipy.signal import butter

from myoverse.datasets.filters.emg_augmentations import WaveletDecomposition
from myoverse.datasets.filters.generic import (
    ApplyFunctionFilter,
    IndexDataFilter,
)
from myoverse.datasets.filters.temporal import SOSFrequencyFilter
from myoverse.datasets.supervised import EMGDataset

# Example 1: Creating a dataset with specific filter pipelines using Zarr 3
dataset = EMGDataset(
    emg_data_path=Path(r"../data/emg.pkl").resolve(),
    ground_truth_data_path=Path(r"../data/kinematics.pkl").resolve(),
    ground_truth_data_type="kinematics",
    sampling_frequency=2044.0,
    tasks_to_use=["1", "2"],
    save_path=Path(r"../data/dataset.zarr").resolve(),
    emg_filter_pipeline_after_chunking=[
        [
            SOSFrequencyFilter(
                sos_filter_coefficients=butter(
                    4, [47.5, 52.5], "bandstop", output="sos", fs=2044
                ),
                is_output=True,
                name="Raw No Powerline (Bandstop 50 Hz)",
                input_is_chunked=True,
            ),
            SOSFrequencyFilter(
                sos_filter_coefficients=butter(4, 20, "lowpass", output="sos", fs=2044),
                is_output=True,
                name="Raw No High Freq (Lowpass 20 Hz)",
                input_is_chunked=True,
            ),
        ]
    ],
    emg_representations_to_filter_after_chunking=[["Last"]],
    ground_truth_filter_pipeline_before_chunking=[
        [
            ApplyFunctionFilter(
                function=np.reshape, newshape=(63, -1), input_is_chunked=False
            ),
            IndexDataFilter(indices=(slice(3, 63),), input_is_chunked=False),
        ]
    ],
    ground_truth_representations_to_filter_before_chunking=[["Input"]],
    ground_truth_filter_pipeline_after_chunking=[
        [
            ApplyFunctionFilter(
                function=partial(np.mean, axis=-1),
                is_output=True,
                name="Mean Kinematics per EMG Chunk",
                input_is_chunked=True,
            ),
        ]
    ],
    ground_truth_representations_to_filter_after_chunking=[["Last"]],
    chunk_size=192,
    chunk_shift=64,
    testing_split_ratio=0.3,
    validation_split_ratio=0.1,
    augmentation_pipelines=[
        [
            WaveletDecomposition(
                nr_of_grids=5, is_output=True, level=2, input_is_chunked=False
            )
        ]
    ],
    debug_level=1,  # Print progress and summary tables while the dataset is built
    silence_zarr_warnings=True,  # Silence zarr codec warnings
)

# Create the dataset
dataset.create_dataset()
────────────────────────── STARTING DATASET CREATION ───────────────────────────

                             Dataset Configuration
╭──────────────────────────────────┬───────────────────────────────────────────╮
│  Parameter                       │  Value                                    │
├──────────────────────────────────┼───────────────────────────────────────────┤
│  EMG data path                   │  /home/runner/work/MyoVerse/MyoVerse/ex…  │
│  Ground truth data path          │  /home/runner/work/MyoVerse/MyoVerse/ex…  │
│  Ground truth data type          │  kinematics                               │
│  Sampling frequency (Hz)         │  2044.0                                   │
│  Save path                       │  /home/runner/work/MyoVerse/MyoVerse/ex…  │
│  Chunk size                      │  192                                      │
│  Chunk shift                     │  64                                       │
│  Testing split ratio             │  0.3                                      │
│  Validation split ratio          │  0.1                                      │
│  Amount of chunks to augment at  │  250                                      │
│  once                            │                                           │
│  Debug level                     │  1                                        │
│  Silence Zarr warnings           │  True                                     │
╰──────────────────────────────────┴───────────────────────────────────────────╯

Processing 2 tasks: 1, 2

Dataset Structure
├── EMG Data
│   ├── Task 1: Shape (320, 20440)
│   └── Task 2: Shape (320, 20440)
└── Ground Truth Data
    ├── Task 1: Shape (21, 3, 20440)
    └── Task 2: Shape (21, 3, 20440)

─────────────────────────────── PROCESSING TASKS ───────────────────────────────

                                  Initial Data

╭──────── EMG Data Task 1 ────────╮
│  --                             │
│  EMGData                        │
│  Sampling frequency: 2044.0 Hz  │
│  (0) Input (320, 20440)         │
│  --                             │
╰─────────────────────────────────╯
╭─── Ground Truth Data Task 1 ────╮
│  --                             │
│  KinematicsData                 │
│  Sampling frequency: 2044.0 Hz  │
│  (0) Input (21, 3, 20440)       │
│  --                             │
╰─────────────────────────────────╯

                            Pre-Chunking Processing

▶ Applying ground truth filters before chunking...

                                Chunking Process

▶ Chunking EMG data...
▶ Chunking ground truth data...

──────────────────────────────── After Chunking ────────────────────────────────

╭─────────── Chunked EMG Data ────────────╮
│  --                                     │
│  EMGData                                │
│  Sampling frequency: 2044.0 Hz          │
│  (0) Input (320, 20440)                 │
│                                         │
│  Filter(s):                             │
│  (1 | 1) EMG_Chunkizer (317, 320, 192)  │
│  --                                     │
╰─────────────────────────────────────────╯
╭────────────── Chunked Ground Truth Data ──────────────╮
│  --                                                   │
│  KinematicsData                                       │
│  Sampling frequency: 2044.0 Hz                        │
│  (0) Input (21, 3, 20440)                             │
│                                                       │
│  Filter(s):                                           │
│  (1 | 1) ApplyFunctionFilter (63, 20440)              │
│  (2 | 1 -> 2) IndexDataFilter (60, 20440)             │
│  (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192)  │
│  --                                                   │
╰───────────────────────────────────────────────────────╯

                            Post-Chunking Processing

▶ Applying EMG filters after chunking...
▶ Applying ground truth filters after chunking...

───────────────────────── After Filtering Chunked Data ─────────────────────────

╭───────────────────────── Filtered Chunked EMG Data ──────────────────────────╮
│  --                                                                          │
│  EMGData                                                                     │
│  Sampling frequency: 2044.0 Hz                                               │
│  (0) Input (320, 20440)                                                      │
│                                                                              │
│  Filter(s):                                                                  │
│  (1 | 1) EMG_Chunkizer (317, 320, 192)                                       │
│  (2 | 1 -> 2) (Output) Raw No Powerline (Bandstop 50 Hz) (317, 320, 192)     │
│  (3 | 1 -> 2 -> 3) (Output) Raw No High Freq (Lowpass 20 Hz) (317, 320,      │
│  192)                                                                        │
│  --                                                                          │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─────────────────── Filtered Chunked Ground Truth Data ────────────────────╮
│  --                                                                       │
│  KinematicsData                                                           │
│  Sampling frequency: 2044.0 Hz                                            │
│  (0) Input (21, 3, 20440)                                                 │
│                                                                           │
│  Filter(s):                                                               │
│  (1 | 1) ApplyFunctionFilter (63, 20440)                                  │
│  (2 | 1 -> 2) IndexDataFilter (60, 20440)                                 │
│  (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192)                      │
│  (4 | 1 -> 2 -> 3 -> 4) (Output) Mean Kinematics per EMG Chunk (317, 60)  │
│  --                                                                       │
╰───────────────────────────────────────────────────────────────────────────╯

                                Dataset Creation
▶ Adding processed data to dataset...

Adding data with keys: ['Raw No Powerline (Bandstop 50 Hz)', 'Raw No High Freq
(Lowpass 20 Hz)']

Splitting representation: Raw No Powerline (Bandstop 50 Hz) with shape (317,
320, 192)
  Training shape: (223, 320, 192)
  Testing shape: (94, 320, 192)
  After validation split - Testing shape: (86, 320, 192)
  Validation shape: (8, 320, 192)

Splitting representation: Raw No High Freq (Lowpass 20 Hz) with shape (317, 320,
192)
  Training shape: (223, 320, 192)
  Testing shape: (94, 320, 192)
  After validation split - Testing shape: (86, 320, 192)
  Validation shape: (8, 320, 192)

          Dataset Split Sizes
╭───────────────────┬──────────────────╮
│  Split            │  Sizes           │
├───────────────────┼──────────────────┤
│  Training         │  [223, 223]      │
│  Testing          │  [86, 86]        │
│  Validation       │  [8, 8]          │
╰───────────────────┴──────────────────╯

Adding data with keys: ['Mean Kinematics per EMG Chunk']

Splitting representation: Mean Kinematics per EMG Chunk with shape (317, 60)
  Training shape: (223, 60)
  Testing shape: (94, 60)
  After validation split - Testing shape: (86, 60)
  Validation shape: (8, 60)

          Dataset Split Sizes
╭───────────────────────┬──────────────╮
│  Split                │  Sizes       │
├───────────────────────┼──────────────┤
│  Training             │  [223]       │
│  Testing              │  [86]        │
│  Validation           │  [8]         │
╰───────────────────────┴──────────────╯

                                  Initial Data

╭──────── EMG Data Task 2 ────────╮
│  --                             │
│  EMGData                        │
│  Sampling frequency: 2044.0 Hz  │
│  (0) Input (320, 20440)         │
│  --                             │
╰─────────────────────────────────╯
╭─── Ground Truth Data Task 2 ────╮
│  --                             │
│  KinematicsData                 │
│  Sampling frequency: 2044.0 Hz  │
│  (0) Input (21, 3, 20440)       │
│  --                             │
╰─────────────────────────────────╯

                            Pre-Chunking Processing

▶ Applying ground truth filters before chunking...

                                Chunking Process

▶ Chunking EMG data...
▶ Chunking ground truth data...

──────────────────────────────── After Chunking ────────────────────────────────

╭─────────── Chunked EMG Data ────────────╮
│  --                                     │
│  EMGData                                │
│  Sampling frequency: 2044.0 Hz          │
│  (0) Input (320, 20440)                 │
│                                         │
│  Filter(s):                             │
│  (1 | 1) EMG_Chunkizer (317, 320, 192)  │
│  --                                     │
╰─────────────────────────────────────────╯
╭────────────── Chunked Ground Truth Data ──────────────╮
│  --                                                   │
│  KinematicsData                                       │
│  Sampling frequency: 2044.0 Hz                        │
│  (0) Input (21, 3, 20440)                             │
│                                                       │
│  Filter(s):                                           │
│  (1 | 1) ApplyFunctionFilter (63, 20440)              │
│  (2 | 1 -> 2) IndexDataFilter (60, 20440)             │
│  (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192)  │
│  --                                                   │
╰───────────────────────────────────────────────────────╯

                            Post-Chunking Processing

▶ Applying EMG filters after chunking...
▶ Applying ground truth filters after chunking...

───────────────────────── After Filtering Chunked Data ─────────────────────────

╭───────────────────────── Filtered Chunked EMG Data ──────────────────────────╮
│  --                                                                          │
│  EMGData                                                                     │
│  Sampling frequency: 2044.0 Hz                                               │
│  (0) Input (320, 20440)                                                      │
│                                                                              │
│  Filter(s):                                                                  │
│  (1 | 1) EMG_Chunkizer (317, 320, 192)                                       │
│  (2 | 1 -> 2) (Output) Raw No Powerline (Bandstop 50 Hz) (317, 320, 192)     │
│  (3 | 1 -> 2 -> 3) (Output) Raw No High Freq (Lowpass 20 Hz) (317, 320,      │
│  192)                                                                        │
│  --                                                                          │
╰──────────────────────────────────────────────────────────────────────────────╯
╭─────────────────── Filtered Chunked Ground Truth Data ────────────────────╮
│  --                                                                       │
│  KinematicsData                                                           │
│  Sampling frequency: 2044.0 Hz                                            │
│  (0) Input (21, 3, 20440)                                                 │
│                                                                           │
│  Filter(s):                                                               │
│  (1 | 1) ApplyFunctionFilter (63, 20440)                                  │
│  (2 | 1 -> 2) IndexDataFilter (60, 20440)                                 │
│  (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192)                      │
│  (4 | 1 -> 2 -> 3 -> 4) (Output) Mean Kinematics per EMG Chunk (317, 60)  │
│  --                                                                       │
╰───────────────────────────────────────────────────────────────────────────╯

                                Dataset Creation
▶ Adding processed data to dataset...

Adding data with keys: ['Raw No Powerline (Bandstop 50 Hz)', 'Raw No High Freq
(Lowpass 20 Hz)']

Splitting representation: Raw No Powerline (Bandstop 50 Hz) with shape (317,
320, 192)
  Training shape: (223, 320, 192)
  Testing shape: (94, 320, 192)
  After validation split - Testing shape: (86, 320, 192)
  Validation shape: (8, 320, 192)

Splitting representation: Raw No High Freq (Lowpass 20 Hz) with shape (317, 320,
192)
  Training shape: (223, 320, 192)
  Testing shape: (94, 320, 192)
  After validation split - Testing shape: (86, 320, 192)
  Validation shape: (8, 320, 192)

          Dataset Split Sizes
╭───────────────────┬──────────────────╮
│  Split            │  Sizes           │
├───────────────────┼──────────────────┤
│  Training         │  [223, 223]      │
│  Testing          │  [86, 86]        │
│  Validation       │  [8, 8]          │
╰───────────────────┴──────────────────╯

Adding data with keys: ['Mean Kinematics per EMG Chunk']

Splitting representation: Mean Kinematics per EMG Chunk with shape (317, 60)
  Training shape: (223, 60)
  Testing shape: (94, 60)
  After validation split - Testing shape: (86, 60)
  Validation shape: (8, 60)

          Dataset Split Sizes
╭───────────────────────┬──────────────╮
│  Split                │  Sizes       │
├───────────────────────┼──────────────┤
│  Training             │  [223]       │
│  Testing              │  [86]        │
│  Validation           │  [8]         │
╰───────────────────────┴──────────────╯

  Processing task 2 (2/2) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:00
──────────────────────────── APPLYING AUGMENTATIONS ────────────────────────────

                 Augmentation Configuration
╭──────────────────────────────────┬────────────────────────╮
│  Parameter                       │  Value                 │
├──────────────────────────────────┼────────────────────────┤
│  Total augmentation pipelines    │  1                     │
│  Pipelines                       │  WaveletDecomposition  │
│  Chunks to augment at once       │  250                   │
│  Total training samples          │  446                   │
╰──────────────────────────────────┴────────────────────────╯

  Pipeline 1/1: Batch 2/2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:00
────────────────────────── DATASET CREATION COMPLETED ──────────────────────────

                      Dataset Summary
╭─────────────────────────────────────────┬────────────────╮
│  Metric                                 │  Value         │
├─────────────────────────────────────────┼────────────────┤
│  Total tasks                            │  2             │
│  Training samples                       │  892           │
│  Testing samples                        │  172           │
│  Validation samples                     │  16            │
│  Total dataset size                     │  1012.75 MB    │
│  Training split size                    │  836.45 MB     │
│  Testing split size                     │  161.29 MB     │
│  Validation split size                  │  15.00 MB      │
╰─────────────────────────────────────────┴────────────────╯

Dataset Structure
├── Training
│   └── EMG Representations
│       ├── Raw No High Freq (Lowpass 20 Hz): (892, 320, 192)
│       └── Raw No Powerline (Bandstop 50 Hz): (892, 320, 192)
├── Testing
│   └── EMG Representations
│       ├── Raw No High Freq (Lowpass 20 Hz): (172, 320, 192)
│       └── Raw No Powerline (Bandstop 50 Hz): (172, 320, 192)
└── Validation
    └── EMG Representations
        ├── Raw No High Freq (Lowpass 20 Hz): (16, 320, 192)
        └── Raw No Powerline (Bandstop 50 Hz): (16, 320, 192)

─────────────────── Dataset Creation Successfully Completed! ───────────────────
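
The shapes reported in the log above follow directly from the chunking parameters. The sketch below reproduces them with plain NumPy on placeholder arrays. It is an illustration that assumes a simple sliding window without padding, not MyoVerse's own implementation, and all variable names are made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Values taken from the example configuration and the log above.
n_samples, chunk_size, chunk_shift = 20440, 192, 64

# Number of full windows when stepping chunk_shift samples at a time
# (assumption: no padding, incomplete trailing windows are dropped).
n_chunks = (n_samples - chunk_size) // chunk_shift + 1


def chunk(data, size, shift):
    """Cut the last axis into overlapping windows: (chunks, rows, size)."""
    windows = sliding_window_view(data, size, axis=-1)[:, ::shift]
    return np.moveaxis(windows, 1, 0)


# Placeholder EMG with the logged shape (channels, samples).
emg = np.zeros((320, n_samples), dtype=np.float32)
emg_chunks = chunk(emg, chunk_size, chunk_shift)

# Placeholder kinematics with the logged shape (21, 3, samples).
kinematics = np.zeros((21, 3, n_samples), dtype=np.float32)
flat = kinematics.reshape(63, -1)  # same effect as np.reshape(..., (63, -1))
selected = flat[3:63]              # same effect as slice(3, 63): drops the first 3 rows
gt_chunks = chunk(selected, chunk_size, chunk_shift)
gt_mean = gt_chunks.mean(axis=-1)  # one target vector per EMG chunk

print(n_chunks, emg_chunks.shape, gt_chunks.shape, gt_mean.shape)
# 317 (317, 320, 192) (317, 60, 192) (317, 60)
```

With chunk_size=192 and chunk_shift=64, consecutive chunks overlap by 128 samples, which is why 20440 samples yield 317 chunks rather than the 106 that non-overlapping windows would give.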

Default datasets are also available. Here is an example of how to use the EMBCDataset used in [2].

from myoverse.datasets.defaults import EMBCDataset

# Using the default EMBCDataset with Zarr 3
dataset = EMBCDataset(
    emg_data_path=Path(r"../data/emg.pkl").resolve(),
    ground_truth_data_path=Path(r"../data/kinematics.pkl").resolve(),
    save_path=Path(r"../data/dataset_embc.zarr").resolve(),
    tasks_to_use=["1", "2"],
    debug_level=1,
    silence_zarr_warnings=True,  # Silence zarr codec warnings
)

# Create the EMBC dataset
dataset.create_dataset()
────────────────────────── STARTING DATASET CREATION ───────────────────────────

                             Dataset Configuration
╭──────────────────────────────────┬───────────────────────────────────────────╮
│  Parameter                       │  Value                                    │
├──────────────────────────────────┼───────────────────────────────────────────┤
│  EMG data path                   │  /home/runner/work/MyoVerse/MyoVerse/ex…  │
│  Ground truth data path          │  /home/runner/work/MyoVerse/MyoVerse/ex…  │
│  Ground truth data type          │  kinematics                               │
│  Sampling frequency (Hz)         │  2048.0                                   │
│  Save path                       │  /home/runner/work/MyoVerse/MyoVerse/ex…  │
│  Chunk size                      │  192                                      │
│  Chunk shift                     │  64                                       │
│  Testing split ratio             │  0.2                                      │
│  Validation split ratio          │  0.2                                      │
│  Amount of chunks to augment at  │  500                                      │
│  once                            │                                           │
│  Debug level                     │  1                                        │
│  Silence Zarr warnings           │  True                                     │
╰──────────────────────────────────┴───────────────────────────────────────────╯

Processing 2 tasks: 1, 2

Dataset Structure
├── EMG Data
│   ├── Task 1: Shape (320, 20440)
│   └── Task 2: Shape (320, 20440)
└── Ground Truth Data
    ├── Task 1: Shape (21, 3, 20440)
    └── Task 2: Shape (21, 3, 20440)

─────────────────────────────── PROCESSING TASKS ───────────────────────────────

                                  Initial Data

╭──────── EMG Data Task 1 ────────╮
│  --                             │
│  EMGData                        │
│  Sampling frequency: 2048.0 Hz  │
│  (0) Input (320, 20440)         │
│  --                             │
╰─────────────────────────────────╯
╭─── Ground Truth Data Task 1 ────╮
│  --                             │
│  KinematicsData                 │
│  Sampling frequency: 2048.0 Hz  │
│  (0) Input (21, 3, 20440)       │
│  --                             │
╰─────────────────────────────────╯

                            Pre-Chunking Processing

▶ Applying ground truth filters before chunking...

                                Chunking Process

▶ Chunking EMG data...
▶ Chunking ground truth data...

──────────────────────────────── After Chunking ────────────────────────────────

╭─────────── Chunked EMG Data ────────────╮
│  --                                     │
│  EMGData                                │
│  Sampling frequency: 2048.0 Hz          │
│  (0) Input (320, 20440)                 │
│                                         │
│  Filter(s):                             │
│  (1 | 1) EMG_Chunkizer (317, 320, 192)  │
│  --                                     │
╰─────────────────────────────────────────╯
╭────────────── Chunked Ground Truth Data ──────────────╮
│  --                                                   │
│  KinematicsData                                       │
│  Sampling frequency: 2048.0 Hz                        │
│  (0) Input (21, 3, 20440)                             │
│                                                       │
│  Filter(s):                                           │
│  (1 | 1) Reshape (63, 20440)                          │
│  (2 | 1 -> 2) IndexDataFilter (60, 20440)             │
│  (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192)  │
│  --                                                   │
╰───────────────────────────────────────────────────────╯

                            Post-Chunking Processing

▶ Applying EMG filters after chunking...
▶ Applying ground truth filters after chunking...

───────────────────────── After Filtering Chunked Data ─────────────────────────

╭─────────────────── Filtered Chunked EMG Data ───────────────────╮
│  --                                                             │
│  EMGData                                                        │
│  Sampling frequency: 2048.0 Hz                                  │
│  (0) Input (320, 20440)                                         │
│                                                                 │
│  Filter(s):                                                     │
│  (1 | 1) EMG_Chunkizer (317, 320, 192)                          │
│  (2 | 1 -> 2) (Output) raw (317, 320, 192)                      │
│  (3 | 1 -> 2 -> 3) (Output) SOSFrequencyFilter (317, 320, 192)  │
│  --                                                             │
╰─────────────────────────────────────────────────────────────────╯
╭───────── Filtered Chunked Ground Truth Data ──────────╮
│  --                                                   │
│  KinematicsData                                       │
│  Sampling frequency: 2048.0 Hz                        │
│  (0) Input (21, 3, 20440)                             │
│                                                       │
│  Filter(s):                                           │
│  (1 | 1) Reshape (63, 20440)                          │
│  (2 | 1 -> 2) IndexDataFilter (60, 20440)             │
│  (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192)  │
│  (4 | 1 -> 2 -> 3 -> 4) (Output) Mean (317, 60)       │
│  --                                                   │
╰───────────────────────────────────────────────────────╯

                                Dataset Creation
▶ Adding processed data to dataset...

Adding data with keys: ['raw', 'SOSFrequencyFilter']

Splitting representation: raw with shape (317, 320, 192)
  Training shape: (255, 320, 192)
  Testing shape: (62, 320, 192)
  After validation split - Testing shape: (50, 320, 192)
  Validation shape: (12, 320, 192)

Splitting representation: SOSFrequencyFilter with shape (317, 320, 192)
  Training shape: (255, 320, 192)
  Testing shape: (62, 320, 192)
  After validation split - Testing shape: (50, 320, 192)
  Validation shape: (12, 320, 192)

          Dataset Split Sizes
╭───────────────────┬──────────────────╮
│  Split            │  Sizes           │
├───────────────────┼──────────────────┤
│  Training         │  [255, 255]      │
│  Testing          │  [50, 50]        │
│  Validation       │  [12, 12]        │
╰───────────────────┴──────────────────╯

Adding data with keys: ['Mean']

Splitting representation: Mean with shape (317, 60)
  Training shape: (255, 60)
  Testing shape: (62, 60)
  After validation split - Testing shape: (50, 60)
  Validation shape: (12, 60)

          Dataset Split Sizes
╭───────────────────────┬──────────────╮
│  Split                │  Sizes       │
├───────────────────────┼──────────────┤
│  Training             │  [255]       │
│  Testing              │  [50]        │
│  Validation           │  [12]        │
╰───────────────────────┴──────────────╯

                                  Initial Data

╭──────── EMG Data Task 2 ────────╮
│  --                             │
│  EMGData                        │
│  Sampling frequency: 2048.0 Hz  │
│  (0) Input (320, 20440)         │
│  --                             │
╰─────────────────────────────────╯
╭─── Ground Truth Data Task 2 ────╮
│  --                             │
│  KinematicsData                 │
│  Sampling frequency: 2048.0 Hz  │
│  (0) Input (21, 3, 20440)       │
│  --                             │
╰─────────────────────────────────╯

                            Pre-Chunking Processing

▶ Applying ground truth filters before chunking...

                                Chunking Process

▶ Chunking EMG data...
▶ Chunking ground truth data...

──────────────────────────────── After Chunking ────────────────────────────────

╭─────────── Chunked EMG Data ────────────╮
│  --                                     │
│  EMGData                                │
│  Sampling frequency: 2048.0 Hz          │
│  (0) Input (320, 20440)                 │
│                                         │
│  Filter(s):                             │
│  (1 | 1) EMG_Chunkizer (317, 320, 192)  │
│  --                                     │
╰─────────────────────────────────────────╯
╭────────────── Chunked Ground Truth Data ──────────────╮
│  --                                                   │
│  KinematicsData                                       │
│  Sampling frequency: 2048.0 Hz                        │
│  (0) Input (21, 3, 20440)                             │
│                                                       │
│  Filter(s):                                           │
│  (1 | 1) Reshape (63, 20440)                          │
│  (2 | 1 -> 2) IndexDataFilter (60, 20440)             │
│  (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192)  │
│  --                                                   │
╰───────────────────────────────────────────────────────╯

                            Post-Chunking Processing

▶ Applying EMG filters after chunking...
▶ Applying ground truth filters after chunking...

───────────────────────── After Filtering Chunked Data ─────────────────────────

╭─────────────────── Filtered Chunked EMG Data ───────────────────╮
│  --                                                             │
│  EMGData                                                        │
│  Sampling frequency: 2048.0 Hz                                  │
│  (0) Input (320, 20440)                                         │
│                                                                 │
│  Filter(s):                                                     │
│  (1 | 1) EMG_Chunkizer (317, 320, 192)                          │
│  (2 | 1 -> 2) (Output) raw (317, 320, 192)                      │
│  (3 | 1 -> 2 -> 3) (Output) SOSFrequencyFilter (317, 320, 192)  │
│  --                                                             │
╰─────────────────────────────────────────────────────────────────╯
╭───────── Filtered Chunked Ground Truth Data ──────────╮
│  --                                                   │
│  KinematicsData                                       │
│  Sampling frequency: 2048.0 Hz                        │
│  (0) Input (21, 3, 20440)                             │
│                                                       │
│  Filter(s):                                           │
│  (1 | 1) Reshape (63, 20440)                          │
│  (2 | 1 -> 2) IndexDataFilter (60, 20440)             │
│  (3 | 1 -> 2 -> 3) ChunkizeDataFilter (317, 60, 192)  │
│  (4 | 1 -> 2 -> 3 -> 4) (Output) Mean (317, 60)       │
│  --                                                   │
╰───────────────────────────────────────────────────────╯

                                Dataset Creation
▶ Adding processed data to dataset...

Adding data with keys: ['raw', 'SOSFrequencyFilter']

Splitting representation: raw with shape (317, 320, 192)
  Training shape: (255, 320, 192)
  Testing shape: (62, 320, 192)
  After validation split - Testing shape: (50, 320, 192)
  Validation shape: (12, 320, 192)

Splitting representation: SOSFrequencyFilter with shape (317, 320, 192)
  Training shape: (255, 320, 192)
  Testing shape: (62, 320, 192)
  After validation split - Testing shape: (50, 320, 192)
  Validation shape: (12, 320, 192)

          Dataset Split Sizes
╭───────────────────┬──────────────────╮
│  Split            │  Sizes           │
├───────────────────┼──────────────────┤
│  Training         │  [255, 255]      │
│  Testing          │  [50, 50]        │
│  Validation       │  [12, 12]        │
╰───────────────────┴──────────────────╯

Adding data with keys: ['Mean']

Splitting representation: Mean with shape (317, 60)
  Training shape: (255, 60)
  Testing shape: (62, 60)
  After validation split - Testing shape: (50, 60)
  Validation shape: (12, 60)

          Dataset Split Sizes
╭───────────────────────┬──────────────╮
│  Split                │  Sizes       │
├───────────────────────┼──────────────┤
│  Training             │  [255]       │
│  Testing              │  [50]        │
│  Validation           │  [12]        │
╰───────────────────────┴──────────────╯

  Processing task 2 (2/2) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:00
──────────────────────────── APPLYING AUGMENTATIONS ────────────────────────────

                 Augmentation Configuration
╭──────────────────────────────────┬────────────────────────╮
│  Parameter                       │  Value                 │
├──────────────────────────────────┼────────────────────────┤
│  Total augmentation pipelines    │  3                     │
│  Pipelines                       │  GaussianNoise         │
│                                  │  MagnitudeWarping      │
│                                  │  WaveletDecomposition  │
│  Chunks to augment at once       │  500                   │
│  Total training samples          │  510                   │
╰──────────────────────────────────┴────────────────────────╯

  Pipeline 3/3: Batch 5/5 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:00
────────────────────────── DATASET CREATION COMPLETED ──────────────────────────

                      Dataset Summary
╭─────────────────────────────────────────┬────────────────╮
│  Metric                                 │  Value         │
├─────────────────────────────────────────┼────────────────┤
│  Total tasks                            │  2             │
│  Training samples                       │  4080          │
│  Testing samples                        │  100           │
│  Validation samples                     │  24            │
│  Total dataset size                     │  2464.24 MB    │
│  Training split size                    │  2391.56 MB    │
│  Testing split size                     │  58.62 MB      │
│  Validation split size                  │  14.07 MB      │
╰─────────────────────────────────────────┴────────────────╯

Dataset Structure
├── Training
│   └── EMG Representations
│       ├── SOSFrequencyFilter: (4080, 320, 192)
│       └── raw: (4080, 320, 192)
├── Testing
│   └── EMG Representations
│       ├── SOSFrequencyFilter: (100, 320, 192)
│       └── raw: (100, 320, 192)
└── Validation
    └── EMG Representations
        ├── SOSFrequencyFilter: (24, 320, 192)
        └── raw: (24, 320, 192)

─────────────────── Dataset Creation Successfully Completed! ───────────────────
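
The two runs use different split ratios (0.3/0.1 and 0.2/0.2), and the logged split sizes in both are consistent with one simple rule: the held-out count is rounded down to an even number, and the validation set is then carved out of the held-out chunks in the same way. The sketch below encodes that rule. It is inferred from the logged numbers only, is not taken from the MyoVerse source, and `split_sizes` is a hypothetical helper.

```python
def split_sizes(n_chunks, testing_ratio, validation_ratio):
    """Return (training, testing, validation) chunk counts for one task.

    Assumption inferred from the logs: each held-out count is rounded
    down to the nearest even number.
    """

    def even_floor(x):
        return 2 * (int(x) // 2)

    held_out = even_floor(n_chunks * testing_ratio)
    validation = even_floor(held_out * validation_ratio)
    return n_chunks - held_out, held_out - validation, validation


n_tasks = 2
first = split_sizes(317, testing_ratio=0.3, validation_ratio=0.1)
second = split_sizes(317, testing_ratio=0.2, validation_ratio=0.2)

print(first)   # (223, 86, 8)
print(second)  # (255, 50, 12)

# Totals over both tasks, before any augmentation is applied.
print([n_tasks * n for n in first])   # [446, 172, 16]
print([n_tasks * n for n in second])  # [510, 100, 24]
```

The per-task counts match the "Dataset Split Sizes" tables, and the totals match the "Total training samples" rows (446 and 510) as well as the testing and validation counts in the two dataset summaries. The training counts in the summaries are larger because they include the augmented copies.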

Total running time of the script: (0 minutes 46.263 seconds)

Estimated memory usage: 1947 MB
