DREAMERDataset
- class torcheeg.datasets.DREAMERDataset(mat_path: str = './DREAMER.mat', chunk_size: int = 128, overlap: int = 0, num_channel: int = 14, num_baseline: int = 61, baseline_chunk_size: int = 128, online_transform: None | Callable = None, offline_transform: None | Callable = None, label_transform: None | Callable = None, before_trial: None | Callable = None, after_trial: None | Callable = None, after_session: None | Callable = None, after_subject: None | Callable = None, io_path: None | str = None, io_size: int = 1048576, io_mode: str = 'lmdb', num_worker: int = 0, verbose: bool = True)
A multi-modal database consisting of electroencephalogram and electrocardiogram signals recorded during affect elicitation by means of audio-visual stimuli. This class generates training samples and test samples according to the given parameters, and caches the generated results in a unified input and output format (IO). The relevant information of the dataset is as follows:
Author: Katsigiannis et al.
Year: 2017
Download URL: https://zenodo.org/record/546113
Reference: Katsigiannis S, Ramzan N. DREAMER: A database for emotion recognition through EEG and ECG signals from wireless low-cost off-the-shelf devices[J]. IEEE journal of biomedical and health informatics, 2017, 22(1): 98-107.
Stimulus: 18 movie clips.
Signals: Electroencephalogram (14 channels at 128Hz), and electrocardiogram (2 channels at 256Hz) of 23 subjects.
Rating: Arousal, valence, like/dislike, dominance, familiarity (all on a scale from 1 to 5).
To use this dataset, the downloaded file DREAMER.mat is required.
An example dataset for CNN-based methods:
    from torcheeg.datasets import DREAMERDataset
    from torcheeg import transforms
    from torcheeg.datasets.constants.emotion_recognition.dreamer import DREAMER_CHANNEL_LOCATION_DICT

    dataset = DREAMERDataset(mat_path='./DREAMER.mat',
                             offline_transform=transforms.Compose([
                                 transforms.BandDifferentialEntropy(),
                                 transforms.ToGrid(DREAMER_CHANNEL_LOCATION_DICT)
                             ]),
                             online_transform=transforms.ToTensor(),
                             label_transform=transforms.Compose([
                                 transforms.Select('valence'),
                                 transforms.Binary(3.0),
                             ]))
    print(dataset[0])
    # EEG signal (torch.Tensor[4, 9, 9]),
    # corresponding baseline signal (torch.Tensor[4, 9, 9]),
    # label (int)
Another example dataset for CNN-based methods:
    from torcheeg.datasets import DREAMERDataset
    from torcheeg import transforms

    dataset = DREAMERDataset(mat_path='./DREAMER.mat',
                             online_transform=transforms.Compose([
                                 transforms.To2d(),
                                 transforms.ToTensor()
                             ]),
                             label_transform=transforms.Compose([
                                 transforms.Select(['valence', 'arousal']),
                                 transforms.Binary(3.0),
                                 transforms.BinariesToCategory()
                             ]))
    print(dataset[0])
    # EEG signal (torch.Tensor[1, 14, 128]),
    # corresponding baseline signal (torch.Tensor[1, 14, 128]),
    # label (int)
An example dataset for GNN-based methods:
    from torcheeg.datasets import DREAMERDataset
    from torcheeg import transforms
    from torcheeg.datasets.constants.emotion_recognition.dreamer import DREAMER_ADJACENCY_MATRIX
    from torcheeg.transforms.pyg import ToG

    dataset = DREAMERDataset(mat_path='./DREAMER.mat',
                             online_transform=transforms.Compose([
                                 ToG(DREAMER_ADJACENCY_MATRIX)
                             ]),
                             label_transform=transforms.Compose([
                                 transforms.Select('arousal'),
                                 transforms.Binary(3.0)
                             ]))
    print(dataset[0])
    # EEG signal (torch_geometric.data.Data),
    # corresponding baseline signal (torch_geometric.data.Data),
    # label (int)
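The label pipelines in the examples above chain a selection step, a threshold, and (in the second example) a merge of two binary labels into one class index. The following is a plain-Python sketch of that logic, not TorchEEG's implementation. The helper names select, binarize, and binaries_to_category are hypothetical, the threshold comparison is assumed to be inclusive (>=), and the bit order used to merge the binary labels is also an assumption.

```python
from typing import Dict, List, Sequence, Union

Number = Union[int, float]


def select(info: Dict[str, Number], keys: Union[str, Sequence[str]]):
    """Pick one rating (string key) or several ratings (sequence of keys)."""
    if isinstance(keys, str):
        return info[keys]
    return [info[k] for k in keys]


def binarize(value, threshold: float):
    """Map ratings to 0/1. Assumption: the threshold is inclusive (>=)."""
    if isinstance(value, list):
        return [int(v >= threshold) for v in value]
    return int(value >= threshold)


def binaries_to_category(bits: List[int]) -> int:
    """Read the 0/1 labels as a binary number.

    Assumption: the first label is the most significant bit.
    """
    category = 0
    for bit in bits:
        category = category * 2 + bit
    return category


# Hypothetical ratings for one sample, on the 1-5 scale used by DREAMER
info = {'valence': 4, 'arousal': 2, 'dominance': 3}

# Single binary label, as in the first and third examples
single = binarize(select(info, 'valence'), 3.0)

# Two binary labels merged into one of four classes, as in the second example
combined = binaries_to_category(
    binarize(select(info, ['valence', 'arousal']), 3.0))

print(single, combined)
```

Under these assumptions, a single selected rating yields a two-class label, while two selected ratings yield one of four classes. Check the behavior of the installed TorchEEG version before relying on the exact class indices.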
- Parameters:
mat_path (str) – Path to the downloaded data file in MATLAB .mat format. (default: './DREAMER.mat')
chunk_size (int) – Number of data points included in each EEG chunk used as a training or test sample. If set to -1, the entire EEG signal of a trial is used as a single chunk. (default: 128)
overlap (int) – Number of overlapping data points between adjacent chunks when dividing the EEG signal into chunks. (default: 0)
num_channel (int) – Number of channels used, of which the first 14 channels are EEG signals. (default: 14)
num_baseline (int) – Number of baseline signal chunks used. (default: 61)
baseline_chunk_size (int) – Number of data points included in each baseline signal chunk. The baseline signal in the DREAMER dataset has a total of 7808 data points. (default: 128)
online_transform (Callable, optional) – The transformation of the EEG signals and baseline EEG signals. The input is a np.ndarray, and the output is used as the first and second value of each element in the dataset. (default: None)
offline_transform (Callable, optional) – The usage is the same as online_transform, but it is executed before generating IO intermediate results. (default: None)
label_transform (Callable, optional) – The transformation of the label. The input is an information dictionary, and the output is used as the third value of each element in the dataset. (default: None)
before_trial (Callable, optional) – The hook performed on the trial to which the sample belongs. It is performed before the offline transformation and is thus typically used to implement context-dependent sample transformations, such as moving averages. The input of this hook function is a 2D EEG signal with shape (number of electrodes, number of data points), and its output should ideally have the same shape. (default: None)
after_trial (Callable, optional) – The hook performed on the trial to which the sample belongs. It is performed after the offline transformation and is thus typically used to implement context-dependent sample transformations, such as moving averages. The input and output of this hook function should be a sequence of dictionaries representing a sequence of EEG samples. Each dictionary contains two key-value pairs, indexed by eeg (the EEG signal matrix) and key (the index in the database) respectively. (default: None)
io_path (str) – The path to the generated unified data IO, cached as an intermediate result. If set to None, a random path will be generated. (default: None)
io_size (int) – Maximum size the database may grow to; used to size the memory mapping. If the database grows larger than map_size, an exception will be raised and the user must close and reopen. (default: 1048576)
io_mode (str) – Storage mode of the EEG signal. When io_mode is set to lmdb, TorchEEG provides an efficient database (LMDB) for storing EEG signals. LMDB may not perform well on limited operating systems, for which a file-system-based EEG signal storage is also provided. When io_mode is set to pickle, pickle-based persistence files are used. When io_mode is set to memory, the signals are kept in memory. (default: lmdb)
num_worker (int) – Number of subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
verbose (bool) – Whether to display logs during processing, such as progress bars. (default: True)
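As a rough illustration of how chunk_size and overlap interact, the sketch below counts the chunks a sliding window would produce for one trial. It assumes the window advances by chunk_size - overlap points and drops an incomplete final chunk, which may differ from what the library actually does. The function count_chunks and the 60-second trial length are hypothetical.

```python
def count_chunks(num_points: int, chunk_size: int = 128, overlap: int = 0) -> int:
    """Count full windows, assuming a step of chunk_size - overlap and no padding."""
    if chunk_size <= 0:
        # chunk_size = -1: the whole trial is used as a single sample
        return 1
    step = chunk_size - overlap
    if step <= 0:
        raise ValueError('overlap must be smaller than chunk_size')
    if num_points < chunk_size:
        return 0
    return (num_points - chunk_size) // step + 1


SAMPLING_RATE = 128                 # EEG sampling rate in Hz, from the dataset description
trial_points = 60 * SAMPLING_RATE   # hypothetical 60-second trial

print(count_chunks(trial_points))                 # non-overlapping 1-second chunks
print(count_chunks(trial_points, overlap=64))     # chunks overlapping by half
print(count_chunks(trial_points, chunk_size=-1))  # whole trial as one chunk

# Baseline defaults: 61 chunks of 128 points cover the 7808-point baseline
assert 61 * 128 == 7808
```

With the defaults, each chunk spans one second of EEG at 128 Hz, and num_baseline times baseline_chunk_size (61 x 128 = 7808) matches the baseline length stated in the parameter description.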