HMCDataset
- class torcheeg.datasets.HMCDataset(root_path: str = './HMC/recordings', channels: List = ['EEG F4-M1', 'EEG C4-M1', 'EEG O2-M1', 'EEG C3-M2'], l_freq: float = 0.5, h_freq: float = 30, sfreq: int = 100, online_transform: None | Callable = None, offline_transform: None | Callable = None, label_transform: None | Callable = None, io_path: None | str = None, io_size: int = 1048576, io_mode: str = 'lmdb', num_worker: int = 0, verbose: bool = True, **kwargs)
The Haaglanden Medisch Centrum sleep staging database (HMC) is a widely used dataset for sleep stage detection. This class generates training and test samples according to the given parameters and caches the generated results in a unified input and output (IO) format. The relevant information about the dataset is as follows:
Author: Alvarez-Estevez et al.
Year: 2022
Download URL: https://physionet.org/content/hmc-sleep-staging/1.1/
Reference: Alvarez-Estevez D, Rijsman R M. Inter-database validation of a deep learning approach for automatic sleep scoring[J]. PloS one, 2021, 16(8): e0256111.
Signals: A collection of 151 whole-night polysomnographic (PSG) sleep recordings (85 male, 66 female, mean age 53.9 ± 15.4) collected during 2018 at the Haaglanden Medisch Centrum (HMC, The Netherlands) sleep center. The dataset contains electroencephalographic (EEG), electrooculographic (EOG), chin electromyographic (EMG), and electrocardiographic (ECG) activity (‘ECG’, ‘EEG C3-M2’, ‘EEG C4-M1’, ‘EEG F4-M1’, ‘EEG O2-M1’, ‘EMG chin’, ‘EOG E1-M2’, ‘EOG E2-M2’).
Rating: Sleep stages were annotated in contiguous 30-second intervals (Sleep stage W, Sleep stage N1, Sleep stage N2, Sleep stage N3, Sleep stage R, Lights off@@EEG F4-A1).
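For classification, these annotation strings are typically mapped to integer class indices. The following is a minimal plain-Python sketch of that mapping, not TorchEEG's implementation. The name `stage_to_class` is hypothetical, the `'label'` key is taken from the `transforms.Select('label')` call in the example on this page, and mapping `'Lights off@@EEG F4-A1'` to class 0 simply mirrors that example.

```python
# Class indices mirror the label_transform used in the example on this page.
STAGE_TO_CLASS = {
    'Sleep stage W': 0,
    'Sleep stage N1': 1,
    'Sleep stage N2': 2,
    'Sleep stage N3': 3,
    'Sleep stage R': 4,
    # Assumption carried over from the example: treated the same as wake.
    'Lights off@@EEG F4-A1': 0,
}


def stage_to_class(info):
    """Map the annotation string stored under info['label'] to a class index.

    `info` stands in for the information dictionary passed to
    label_transform; only the 'label' key is assumed here.
    """
    return STAGE_TO_CLASS[info['label']]


print(stage_to_class({'label': 'Sleep stage N2'}))  # 2
```

An annotation that is missing from the dictionary raises a `KeyError`, which makes unexpected labels visible early instead of silently assigning them a class.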
In order to use this dataset, the following file structure is required:
HMC/
└── recordings/
    ├── SN001.edf
    ├── SN001.sleepscoring.edf
    ├── SN002.edf
    ├── SN002.sleepscoring.edf
    └── ...

An example dataset:
    from torcheeg import transforms
    from torcheeg.datasets import HMCDataset

    dataset = HMCDataset(
        root_path='./HMC/recordings',
        sfreq=100,
        channels=['EEG F4-M1', 'EEG C4-M1', 'EEG O2-M1', 'EEG C3-M2'],
        # Pick the annotation string and map it to an integer class index.
        label_transform=transforms.Compose([
            transforms.Select('label'),
            transforms.Mapping({
                'Sleep stage W': 0,
                'Sleep stage N1': 1,
                'Sleep stage N2': 2,
                'Sleep stage N3': 3,
                'Sleep stage R': 4,
                'Lights off@@EEG F4-A1': 0
            })
        ]),
        # Normalize each sample and convert it to a tensor at load time.
        online_transform=transforms.Compose([
            transforms.MeanStdNormalize(),
            transforms.ToTensor(),
        ]),
    )
    print(dataset[0])
    # EEG signal (torch.Tensor[4, 3000]),
    # label (int)
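Before constructing the dataset, it can help to confirm that the required file structure is in place. The sketch below uses only the standard library and is not part of TorchEEG. The helper name `find_recordings` is hypothetical, and the only assumption is the naming scheme shown in the file tree above, where each `SNxxx.edf` signal file has a matching `SNxxx.sleepscoring.edf` annotation file.

```python
from pathlib import Path


def find_recordings(root_path):
    """Pair each signal file with its sleep-scoring annotation file.

    Returns (pairs, missing): `pairs` is a list of
    (signal_file, scoring_file) tuples, and `missing` lists signal
    files that have no matching annotation file.
    """
    root = Path(root_path)
    pairs, missing = [], []
    for signal_file in sorted(root.glob('*.edf')):
        # Annotation files also end in .edf, so skip them here.
        if signal_file.name.endswith('.sleepscoring.edf'):
            continue
        scoring_file = signal_file.with_name(
            signal_file.stem + '.sleepscoring.edf')
        if scoring_file.exists():
            pairs.append((signal_file, scoring_file))
        else:
            missing.append(signal_file)
    return pairs, missing


pairs, missing = find_recordings('./HMC/recordings')
print(f'{len(pairs)} complete recordings, {len(missing)} without annotations')
```

If `missing` is non-empty, the download is probably incomplete and is worth fixing before the dataset class starts building its cache.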
- Parameters:
root_path (str) – Root path of the HMC dataset containing .edf and .sleepscoring.edf files. (default: './HMC/recordings')
channels (list) – List of channels to use. Available channels are ‘ECG’, ‘EEG C3-M2’, ‘EEG C4-M1’, ‘EEG F4-M1’, ‘EEG O2-M1’, ‘EMG chin’, ‘EOG E1-M2’, ‘EOG E2-M2’. (default: ['EEG F4-M1', 'EEG C4-M1', 'EEG O2-M1', 'EEG C3-M2'])
l_freq (float) – Low cut-off frequency in Hz. (default: 0.5)
h_freq (float) – High cut-off frequency in Hz. (default: 30)
sfreq (int) – The sampling frequency, in Hz, to which the signal is resampled. (default: 100)
online_transform (Callable, optional) – The transformation of the EEG signals. The input is a np.ndarray, and the output is used as the first value of each element in the dataset. (default: None)
offline_transform (Callable, optional) – The usage is the same as online_transform, but it is executed before the IO intermediate results are generated. (default: None)
label_transform (Callable, optional) – The transformation of the label. The input is an information dictionary, and the output is used as the second value of each element in the dataset. (default: None)
io_path (str, optional) – The path to the generated unified data IO, cached as an intermediate result. If set to None, a random path will be generated. (default: None)
io_size (int) – Maximum size the database may grow to, in bytes; used to size the memory mapping. If the database grows larger than io_size, an exception will be raised and the user must close and reopen it. (default: 1048576)
io_mode (str) – Storage mode of the EEG signals. When io_mode is set to lmdb, TorchEEG provides an efficient database (LMDB) for storing EEG signals. When io_mode is set to pickle, pickle-based persistence files are used. When io_mode is set to memory, the signals are kept in memory. (default: lmdb)
num_worker (int) – Number of subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
verbose (bool) – Whether to display logs during processing, such as progress bars. (default: True)
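As a quick sanity check on how `channels` and `sfreq` relate to the sample shape, the arithmetic can be sketched in plain Python. This assumes each sample covers one 30-second annotation interval, which is consistent with the `torch.Tensor[4, 3000]` output shown in the example at 100 Hz. The helper name `expected_sample_shape` is hypothetical and not part of TorchEEG.

```python
# Annotation interval stated in the dataset description, in seconds.
EPOCH_SECONDS = 30


def expected_sample_shape(channels, sfreq, epoch_seconds=EPOCH_SECONDS):
    """Return (number of channels, samples per epoch) for one sample."""
    return (len(channels), int(sfreq * epoch_seconds))


channels = ['EEG F4-M1', 'EEG C4-M1', 'EEG O2-M1', 'EEG C3-M2']
print(expected_sample_shape(channels, sfreq=100))  # (4, 3000)
```

Changing `sfreq` or the number of selected channels changes the input size that a downstream model has to accept, so it is worth recomputing this shape whenever either parameter is modified.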