P2018Dataset

class torcheeg.datasets.P2018Dataset(root_path: str = './P2018/training/', channels: List = ['F3-M2', 'F4-M1', 'C3-M2', 'C4-M1', 'O1-M2', 'O2-M1'], l_freq: float = 0.5, h_freq: float = 30, sfreq: int = 100, online_transform: None | Callable = None, offline_transform: None | Callable = None, label_transform: None | Callable = None, io_path: None | str = None, io_size: int = 1048576, io_mode: str = 'lmdb', num_worker: int = 0, verbose: bool = True, **kwargs)

The PhysioNet/Computing in Cardiology Challenge 2018 (P2018) is a widely used sleep stage detection dataset. This class generates training and test samples according to the given parameters and caches the generated results in a unified input and output (IO) format. The relevant information about the dataset is as follows:

Author: Ghassemi et al.
Year: 2018
Download URL: https://physionet.org/content/challenge-2018/1.0.0/
Reference: Ghassemi M M, Moody B E, Lehman L W H, et al. You snooze, you win: the physionet/computing in cardiology challenge 2018[C]//2018 Computing in Cardiology Conference (CinC). IEEE, 2018, 45: 1-4.
Signals: 1,985 subjects who were monitored at a Massachusetts General Hospital (MGH) sleep laboratory. The data were partitioned into balanced training (n = 994) and test (n = 989) sets. The subjects had a variety of physiological signals recorded as they slept through the night, including electroencephalography (EEG), electrooculography (EOG), electromyography (EMG), electrocardiography (EKG), and oxygen saturation (SaO2). The available channels are ‘ABD’, ‘AIRFLOW’, ‘C3-M2’, ‘C4-M1’, ‘CHEST’, ‘Chin1-Chin2’, ‘E1-M2’, ‘ECG’, ‘F3-M2’, ‘F4-M1’, ‘O1-M2’, ‘O2-M1’, and ‘SaO2’.
Rating: Sleep stages were annotated in contiguous 30-second intervals. The annotation labels are Sleep stage W, Sleep stage N1, Sleep stage N2, Sleep stage N3, Sleep stage R, and Lights off@@EEG F4-A1.

In order to use this dataset, the following file structure is required:

P2018/
└── training/
    ├── tr03-0005
    ├── tr03-0029
    ├── tr03-0052
    ├── tr03-0061
    └── ...
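Before constructing the dataset, it can help to confirm that the root path really contains the record folders shown above. The following is a minimal sketch, assuming each record (for example tr03-0005) is a subdirectory of the training folder; list_record_dirs is a hypothetical helper and not part of TorchEEG.

```python
from pathlib import Path
from typing import List


def list_record_dirs(root_path: str) -> List[str]:
    """Return the sorted names of all subdirectories under root_path.

    Hypothetical helper for illustration; it is not part of TorchEEG.
    """
    root = Path(root_path)
    if not root.is_dir():
        raise FileNotFoundError(f"Root path not found: {root}")
    # Keep directories only; stray files in the root are ignored.
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Calling list_record_dirs('./P2018/training/') on a correctly laid-out download should return names such as tr03-0005 and tr03-0029; an empty list suggests the root path points at the wrong level.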

An example dataset:

from torcheeg import transforms
from torcheeg.datasets import P2018Dataset

dataset = P2018Dataset(
    root_path='./P2018/training/',
    sfreq=100,
    channels=['F3-M2', 'F4-M1', 'C3-M2', 'C4-M1', 'O1-M2', 'O2-M1'],
    # Pick the 'label' field and map each annotation string to a class index.
    label_transform=transforms.Compose([
        transforms.Select('label'),
        transforms.Mapping({'Sleep stage W': 0,
                            'Sleep stage N1': 1,
                            'Sleep stage N2': 2,
                            'Sleep stage N3': 3,
                            'Sleep stage R': 4,
                            # Mapped to the same class index as wake (0).
                            'Lights off@@EEG F4-A1': 0})
    ]),
    # Applied each time a sample is read from the cached IO.
    online_transform=transforms.Compose([
        transforms.MeanStdNormalize(),
        transforms.ToTensor(),
    ]),
)
print(dataset[0])
# EEG signal (torch.Tensor[6, 3000]),
# label (int)
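The [6, 3000] shape in the example follows from the settings used: six selected channels, and one 30-second annotation interval resampled to 100 Hz. A small sketch of that arithmetic is shown here; epoch_shape is a hypothetical helper used only for illustration, and it assumes each sample covers exactly one 30-second interval.

```python
from typing import Tuple


def epoch_shape(num_channels: int, sfreq: int, epoch_seconds: int = 30) -> Tuple[int, int]:
    """Return (channels, time points) for one annotated interval.

    Hypothetical helper; assumes one sample per 30-second interval.
    """
    return (num_channels, sfreq * epoch_seconds)


channels = ['F3-M2', 'F4-M1', 'C3-M2', 'C4-M1', 'O1-M2', 'O2-M1']
shape = epoch_shape(len(channels), sfreq=100)
print(shape)  # (6, 3000)
```

Changing sfreq or the channel list changes the expected sample shape accordingly, which is worth checking before fixing the input size of a model.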

Parameters:

root_path (str) – Root path of the P2018 dataset. (default: './P2018/training/')
channels (list) – List of EEG channels to use. Available channels are ‘ABD’, ‘AIRFLOW’, ‘C3-M2’, ‘C4-M1’, ‘CHEST’, ‘Chin1-Chin2’, ‘E1-M2’, ‘ECG’, ‘F3-M2’, ‘F4-M1’, ‘O1-M2’, ‘O2-M1’, ‘SaO2’. (default: ['F3-M2', 'F4-M1', 'C3-M2', 'C4-M1', 'O1-M2', 'O2-M1'])
l_freq (float) – Low cut-off frequency in Hz. (default: 0.5)
h_freq (float) – High cut-off frequency in Hz. (default: 30)
sfreq (int) – The sampling frequency to resample the signal to in Hz. (default: 100)
online_transform (Callable, optional) – The transformation of the EEG signals. The input is a np.ndarray, and the output is used as the first value of each element in the dataset. (default: None)
offline_transform (Callable, optional) – The usage is the same as online_transform, but executed before generating IO intermediate results. (default: None)
label_transform (Callable, optional) – The transformation of the label. The input is an information dictionary, and the output is used as the second value of each element in the dataset. (default: None)
io_path (str, optional) – The path to the generated unified data IO, cached as an intermediate result. If set to None, a random path will be generated. (default: None)
io_size (int) – Maximum size the database may grow to; used to size the memory mapping. If the database grows larger than io_size, an exception will be raised and the user must close and reopen it. (default: 1048576)
io_mode (str) – Storage mode of the EEG signals. When io_mode is set to lmdb, TorchEEG uses an efficient database (LMDB) to store EEG signals. When io_mode is set to pickle, pickle-based persistence files are used. When io_mode is set to memory, the signals are kept in memory. (default: lmdb)
num_worker (int) – Number of subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
verbose (bool) – Whether to display logs during processing, such as progress bars, etc. (default: True)
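When changing l_freq, h_freq, or sfreq, the band-pass range should stay below the Nyquist frequency (half the target sampling rate); with the defaults, 30 Hz is well below 100 / 2 = 50 Hz. The sketch below is a hypothetical pre-check written for illustration; it is not a TorchEEG function, and whether TorchEEG validates these values itself is not stated here.

```python
def check_band(l_freq: float, h_freq: float, sfreq: int) -> None:
    """Raise ValueError if the band-pass settings are inconsistent.

    Hypothetical helper; not part of TorchEEG.
    """
    nyquist = sfreq / 2
    if not 0 <= l_freq < h_freq:
        raise ValueError(f"Expected 0 <= l_freq < h_freq, got {l_freq} and {h_freq}")
    if h_freq >= nyquist:
        raise ValueError(f"h_freq ({h_freq} Hz) must be below the Nyquist frequency ({nyquist} Hz)")


# The documented defaults pass the check.
check_band(l_freq=0.5, h_freq=30, sfreq=100)
```

Running such a check before building the dataset avoids regenerating the cached IO after discovering an invalid filter setting.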
