_Data

class myoverse.datatypes._Data(raw_data, sampling_frequency, nr_of_dimensions_when_unchunked)

Base class for all data types.

This class provides common functionality for handling different types of data, including maintaining original and processed representations, tracking filters applied, and managing data flow.

Parameters:

raw_data (np.ndarray) – The raw data to store.
sampling_frequency (float) – The sampling frequency of the data in Hz. Must be greater than 0.
nr_of_dimensions_when_unchunked (int) – The number of dimensions the data has when it is not chunked.

sampling_frequency

The sampling frequency of the data.

Type: float

_last_processing_step

The last processing step applied to the data.

Type: str

_processed_representations

The graph of the processed representations.

Type: networkx.DiGraph

_filters_used

Dictionary of all filters used in the data. The keys are the names of the filters and the values are the filters themselves.

Type: Dict[str, FilterBaseClass]

_data

Dictionary of all data. The keys are the names of the representations and the values are either numpy arrays or DeletedRepresentation objects (for representations that have been deleted to save memory but can be regenerated when needed).

Type: Dict[str, Union[np.ndarray, DeletedRepresentation]]

Raises:

ValueError – If the sampling frequency is less than or equal to 0.

Notes

Memory Management: When representations are deleted with delete_data(), they are replaced with DeletedRepresentation objects that store essential metadata (shape, dtype) but don’t consume memory for the actual data. These representations can be automatically recomputed when accessed. The chunking status is determined from the shape when needed.

Examples

This is an abstract base class and should not be instantiated directly. Instead, use one of the concrete subclasses like EMGData or KinematicsData:

>>> import numpy as np
>>> from myoverse.datatypes import EMGData
>>>
>>> # Create sample data
>>> data = np.random.randn(16, 1000)
>>> emg = EMGData(data, 2000)  # 2000 Hz sampling rate
>>>
>>> # Access attributes from the base _Data class
>>> print(f"Sampling frequency: {emg.sampling_frequency} Hz")
>>> print(f"Is input data chunked: {emg.is_chunked['Input']}")

Methods

`__copy__`()	Create a shallow copy of the instance.
`__getitem__`(key)
`__init__`(raw_data, sampling_frequency, ...)
`__repr__`()	Return repr(self).
`__setitem__`(key, value)
`__str__`()	Return str(self).
`_check_if_chunked`(data)	Checks if the data is chunked or not.
`apply_filter`(filter[, ...])	Applies a filter to the data.
`apply_filter_pipeline`(filter_pipeline, ...)	Applies a pipeline of filters to the data.
`apply_filter_sequence`(filter_sequence[, ...])	Applies a sequence of filters to the data sequentially.
`delete`(representation_to_delete)	Delete both the data and history for a representation.
`delete_data`(representation_to_delete)	Delete data from a representation while keeping its metadata.
`delete_history`(representation_to_delete)	Delete the processing history for a representation.
`get_representation_history`(representation)	Returns the history of a representation.
`load`(filename)	Load data from a file.
`memory_usage`()	Calculate memory usage of each representation.
`plot`(*_, **__)	Plots the data.
`plot_graph`([title])	Draws the graph of the processed representations.
`save`(filename)	Save the data to a file.

classmethod load(filename)

Load data from a file.

Parameters: filename (str) – The name of the file to load the data from.
Returns: The loaded data.
Return type: _Data

apply_filter(filter, representations_to_filter=None, keep_representation_to_filter=True)

Applies a filter to the data.

Parameters:

filter (callable) – The filter to apply.
representations_to_filter (list[str], optional) – A list of representations to filter. The filter is responsible for handling the appropriate number of inputs or raising an error if incompatible. If None, an empty list is used.
keep_representation_to_filter (bool) – Whether to keep the representation(s) to filter or not. If the representation to filter is “Input”, this parameter is ignored.

Returns:

The name of the representation after applying the filter.

Return type:

str

Raises:

ValueError – If representations_to_filter is a string instead of a list
TypeError – If a filter returns a dictionary (no longer supported)
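
A string is itself iterable in Python, so passing "Input" where ["Input"] is expected could otherwise be processed character by character. The sketch below shows one way such an argument check could look. The helper name normalize_representations is hypothetical and is not part of myoverse; only the documented rules (None becomes an empty list, a bare string raises ValueError) are taken from this page.

```python
from typing import List, Optional


def normalize_representations(
    representations_to_filter: Optional[List[str]],
) -> List[str]:
    """Hypothetical helper mirroring the documented argument rules."""
    if representations_to_filter is None:
        # Documented behaviour: None is treated as an empty list.
        return []
    if isinstance(representations_to_filter, str):
        # Reject a bare string early instead of iterating over its characters.
        raise ValueError(
            "representations_to_filter must be a list of names, "
            f"for example [{representations_to_filter!r}], not a string"
        )
    return list(representations_to_filter)


names = normalize_representations(["Input"])
empty = normalize_representations(None)
```

Calling normalize_representations("Input") in this sketch raises ValueError, which matches the error condition listed above.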

apply_filter_pipeline(filter_pipeline, representations_to_filter, keep_individual_filter_steps=True, keep_representation_to_filter=True)

Applies a pipeline of filters to the data.

Parameters:

filter_pipeline (list[list[FilterBaseClass]]) – The pipeline of filters to apply. Each inner list represents a branch of filters.
representations_to_filter (list[list[str]]) – A list of input representations for each branch. Each element corresponds to a branch in the filter_pipeline and must be:

- A list with a single string for standard branches that take one input.
- A list with multiple strings for branches starting with a multi-input filter.

An empty list is not allowed unless the filter explicitly accepts no input.

Note

The length of representations_to_filter must equal the number of branches in the filter_pipeline.

keep_individual_filter_steps (bool) – Whether to keep the results of each filter or not.
keep_representation_to_filter (bool) – Whether to keep the representation(s) to filter or not. If the representation to filter is “Input”, this parameter is ignored.

Returns:

A list containing the names of the final representations from all branches.

Return type:

List[str]

Raises:

ValueError – If the number of filter branches differs from the number of representation lists; if a standard filter is given multiple representations; if no representations are given for a filter that requires input; or if any element of representations_to_filter is a string instead of a list.

Notes

Each branch in the pipeline is processed sequentially using apply_filter_sequence.
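
To illustrate the note above, here is a minimal sketch of branch-wise processing that uses plain Python lists as data and (name, function) pairs as stand-ins for filters. The function run_pipeline and the NamedStep type are hypothetical and only approximate the documented behaviour; multi-input filters are left out for brevity.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]
# Hypothetical stand-in for a filter: its output name and the function it applies.
NamedStep = Tuple[str, Callable[[Signal], Signal]]


def run_pipeline(
    data: Dict[str, Signal],
    filter_pipeline: List[List[NamedStep]],
    representations_to_filter: List[List[str]],
) -> List[str]:
    """Run every branch in order and return the final name of each branch."""
    if len(filter_pipeline) != len(representations_to_filter):
        raise ValueError("one list of input representations is required per branch")
    final_names = []
    for branch, sources in zip(filter_pipeline, representations_to_filter):
        if isinstance(sources, str):
            raise ValueError("each branch input must be a list, not a string")
        if len(sources) != 1:
            raise ValueError("this sketch only supports single-input branches")
        current = sources[0]
        # Within a branch, each step consumes the output of the previous one.
        for name, func in branch:
            data[name] = func(data[current])
            current = name
        final_names.append(current)
    return final_names


data = {"Input": [-1.0, 2.0, -3.0]}
finals = run_pipeline(
    data,
    [
        [("absolute_values", lambda s: [abs(x) for x in s])],
        [("squared_values", lambda s: [x * x for x in s])],
    ],
    [["Input"], ["Input"]],
)
```

Both branches read from the same "Input" entry, and the returned list holds one final representation name per branch.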

Examples

>>> # Example of a pipeline with multiple processing branches
>>> from myoverse.datatypes import EMGData
>>> from myoverse.datasets.filters.generic import ApplyFunctionFilter
>>> import numpy as np
>>>
>>> # Create sample data
>>> data = EMGData(np.random.rand(10, 8), sampling_frequency=1000)
>>>
>>> # Define filter branches that perform different operations on the same input
>>> branch1 = [ApplyFunctionFilter(function=np.abs, name="absolute_values")]
>>> branch2 = [ApplyFunctionFilter(function=lambda x: x**2, name="squared_values")]
>>>
>>> # Apply pipeline with two branches; "Input" names the raw input representation
>>> data.apply_filter_pipeline(
...     filter_pipeline=[branch1, branch2],
...     representations_to_filter=[
...         ["Input"],  # Process branch1 on the input data
...         ["Input"],  # Process branch2 on the input data
...     ],
... )
>>>
>>> # The results are now available as separate representations
>>> abs_values = data["absolute_values"]
>>> squared_values = data["squared_values"]

apply_filter_sequence(filter_sequence, representations_to_filter=None, keep_individual_filter_steps=True, keep_representation_to_filter=True)

Applies a sequence of filters to the data sequentially.

Parameters:

filter_sequence (list[FilterBaseClass]) – The sequence of filters to apply.
representations_to_filter (List[str], optional) – A list of representations to filter for the first filter in the sequence. Each filter is responsible for validating and handling its inputs appropriately. For subsequent filters in the sequence, the output of the previous filter is used.
keep_individual_filter_steps (bool) – Whether to keep the results of each filter or not.
keep_representation_to_filter (bool) – Whether to keep the representation(s) to filter or not. If the representation to filter is “Input”, this parameter is ignored.

Returns:

The name of the last representation after applying all filters.

Return type:

str

Raises:

ValueError – If filter_sequence is empty, if representations_to_filter is empty, or if representations_to_filter is a string instead of a list.
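
The following sketch illustrates the chaining described above and one possible reading of keep_individual_filter_steps. The function run_sequence and the NamedStep type are hypothetical stand-ins. Removing intermediate results outright is a simplification made for this example; the library may instead replace them with DeletedRepresentation objects.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]
# Hypothetical stand-in for a filter: its output name and the function it applies.
NamedStep = Tuple[str, Callable[[Signal], Signal]]


def run_sequence(
    data: Dict[str, Signal],
    filter_sequence: List[NamedStep],
    representations_to_filter: List[str],
    keep_individual_filter_steps: bool = True,
) -> str:
    """Apply the steps one after another and return the last output name."""
    if not filter_sequence:
        raise ValueError("filter_sequence must not be empty")
    if isinstance(representations_to_filter, str):
        raise ValueError("representations_to_filter must be a list, not a string")
    if not representations_to_filter:
        raise ValueError("representations_to_filter must not be empty")

    current = representations_to_filter[0]
    produced = []
    for name, func in filter_sequence:
        # Each step reads the output of the previous step.
        data[name] = func(data[current])
        produced.append(name)
        current = name

    if not keep_individual_filter_steps:
        # Assumption: intermediate results are dropped, the final one is kept.
        for name in produced[:-1]:
            del data[name]
    return current


data = {"Input": [-2.0, 3.0]}
last = run_sequence(
    data,
    [
        ("rectified", lambda s: [abs(x) for x in s]),
        ("doubled", lambda s: [2 * x for x in s]),
    ],
    ["Input"],
    keep_individual_filter_steps=False,
)
```

After the call, only "Input" and "doubled" remain in data, and last holds the name of the final representation.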

delete(representation_to_delete)

Delete both the data and history for a representation.

Parameters: representation_to_delete (str) – The representation to delete.

delete_data(representation_to_delete)

Delete data from a representation while keeping its metadata.

This replaces the actual numpy array with a DeletedRepresentation object that contains metadata about the array, saving memory while allowing regeneration when needed.

Parameters: representation_to_delete (str) – The representation to delete the data from.
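
The placeholder idea can be sketched with plain Python objects. Everything in the block below is hypothetical: Placeholder stands in for DeletedRepresentation, TinyStore stands in for a _Data subclass, and the stored "recipe" per representation is an assumption about how regeneration could work, not a description of the myoverse implementation.

```python
from typing import Callable, Dict, List, Tuple, Union

Matrix = List[List[float]]
Recipe = Tuple[str, Callable[[Matrix], Matrix]]


class Placeholder:
    """Hypothetical stand-in for DeletedRepresentation: metadata only."""

    def __init__(self, shape: Tuple[int, int], dtype: str) -> None:
        self.shape = shape
        self.dtype = dtype


class TinyStore:
    """Toy container for illustration; not the myoverse implementation."""

    def __init__(self, input_data: Matrix) -> None:
        self._data: Dict[str, Union[Matrix, Placeholder]] = {"Input": input_data}
        # Remember which source and function produced each representation.
        self._recipes: Dict[str, Recipe] = {}

    def add(self, name: str, source: str, func: Callable[[Matrix], Matrix]) -> None:
        self._recipes[name] = (source, func)
        self._data[name] = func(self[source])

    def delete_data(self, name: str) -> None:
        values = self._data[name]
        if isinstance(values, Placeholder):
            return  # data was already deleted
        if name not in self._recipes:
            raise ValueError(f"{name!r} has no recipe and could not be regenerated")
        shape = (len(values), len(values[0]) if values else 0)
        self._data[name] = Placeholder(shape=shape, dtype="float")

    def __getitem__(self, name: str) -> Matrix:
        values = self._data[name]
        if isinstance(values, Placeholder):
            # Recompute on access from the stored recipe.
            source, func = self._recipes[name]
            values = func(self[source])
            self._data[name] = values
        return values


store = TinyStore([[1.0, -2.0], [-3.0, 4.0]])
store.add("rectified", "Input", lambda m: [[abs(x) for x in row] for row in m])
store.delete_data("rectified")
placeholder = store._data["rectified"]  # only shape and dtype remain
restored = store["rectified"]  # recomputed from "Input" on access
```

In this sketch the placeholder keeps the shape (2, 2) while the values are gone, and indexing the store rebuilds them from the input.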

delete_history(representation_to_delete)

Delete the processing history for a representation.

Parameters: representation_to_delete (str) – The representation to delete the history for.

get_representation_history(representation)

Returns the history of a representation.

Parameters: representation (str) – The representation to get the history of.
Returns: The history of the representation.
Return type: list[str]
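
As a rough illustration, a history can be read off a processing graph by walking from a representation back to the input. The sketch below replaces the networkx.DiGraph with a plain dictionary that maps each representation to its parent. It assumes that the history is the chain of representation names starting at "Input", which this page does not confirm; representation_history is a hypothetical name.

```python
from typing import Dict, List, Optional


def representation_history(
    parents: Dict[str, Optional[str]], representation: str
) -> List[str]:
    """Return the chain of names from the root down to `representation`."""
    if representation not in parents:
        raise KeyError(f"unknown representation: {representation!r}")
    history = []
    seen = set()
    node: Optional[str] = representation
    while node is not None:
        if node in seen:
            raise ValueError("the processing graph contains a cycle")
        seen.add(node)
        history.append(node)
        node = parents[node]  # step to the representation this one came from
    history.reverse()  # oldest step first
    return history


parents = {
    "Input": None,
    "rectified": "Input",
    "smoothed": "rectified",
    "squared": "Input",
}
history = representation_history(parents, "smoothed")
```

Here history lists "Input", "rectified", and "smoothed" in processing order, while the unrelated "squared" branch is left out.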

memory_usage()

Calculate memory usage of each representation.

Returns: Dictionary with representation names as keys and tuples containing shape as string and memory usage in bytes as values.
Return type: Dict[str, Tuple[str, int]]
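
The shape of the returned dictionary can be illustrated with standard-library buffers. This is a sketch under stated assumptions: each representation is modelled as a (shape, flat buffer) pair built on array.array, and the free function memory_usage below is a hypothetical stand-in for the method. With NumPy arrays the byte count would typically come from the nbytes attribute instead.

```python
from array import array
from typing import Dict, Tuple

# Hypothetical model of a representation: its logical shape and a flat buffer.
Representation = Tuple[Tuple[int, ...], array]


def memory_usage(
    representations: Dict[str, Representation],
) -> Dict[str, Tuple[str, int]]:
    """Map each name to (shape as string, size of the buffer in bytes)."""
    usage = {}
    for name, (shape, buffer) in representations.items():
        usage[name] = (str(shape), len(buffer) * buffer.itemsize)
    return usage


representations = {
    "Input": ((2, 16), array("d", [0.0] * 32)),
    "rectified": ((2, 16), array("f", [0.0] * 32)),
}
usage = memory_usage(representations)
```

Each value pairs a printable shape such as "(2, 16)" with the number of bytes held by that representation.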

abstractmethod plot(*_, **__)

Plots the data.

Parameters:

_ (Any)
__ (Any)

plot_graph(title=None)

Draws the graph of the processed representations.

Parameters: title (Optional[str], default=None) – Optional title for the graph. If None, no title will be displayed.

save(filename)

Save the data to a file.

Parameters: filename (str) – The name of the file to save the data to.
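
The file format used by save and load is not stated on this page. The sketch below assumes pickle purely for illustration and uses a hypothetical Recording class in place of a _Data subclass. It shows the general pattern of an instance method that writes state and a classmethod that rebuilds an instance from it.

```python
import os
import pickle
import tempfile


class Recording:
    """Hypothetical stand-in for a _Data subclass."""

    def __init__(self, samples, sampling_frequency):
        if sampling_frequency <= 0:
            raise ValueError("sampling_frequency must be greater than 0")
        self.samples = samples
        self.sampling_frequency = sampling_frequency

    def save(self, filename):
        # Store plain state rather than the object itself (assumed format).
        state = {
            "samples": self.samples,
            "sampling_frequency": self.sampling_frequency,
        }
        with open(filename, "wb") as handle:
            pickle.dump(state, handle)

    @classmethod
    def load(cls, filename):
        with open(filename, "rb") as handle:
            state = pickle.load(handle)
        # Rebuilding through __init__ re-runs the sampling frequency check.
        return cls(state["samples"], state["sampling_frequency"])


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "recording.pkl")
    Recording([0.1, 0.2], 2000.0).save(path)
    restored = Recording.load(path)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.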

property input_data: ndarray

Returns the input data.

property is_chunked: Dict[str, bool]

Returns whether the data is chunked or not.

Returns: A dictionary where the keys are the representations and the values are whether the data is chunked or not.
Return type: Dict[str, bool]

property output_representations: Dict[str, ndarray]

Returns the output representations of the data.

property processed_representations: Dict[str, ndarray]

Returns the processed representations of the data.
