Edited by: Simon C. Warby, Stanford University, USA

Reviewed by: Christian Bénar, Institut National de la Recherche Médicale, France; Alpar S. Lazar, University of Cambridge, UK

*Correspondence: Piotr J. Durka, Faculty of Physics, University of Warsaw, ul. Pasteura 5, 02-093 Warsaw, Poland

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

We present a complete framework for time-frequency parametrization of EEG transients, based upon matching pursuit (MP) decomposition, applied to the detection of sleep spindles. The ranges of spindle duration (>0.5 s) and frequency (11–16 Hz) are taken directly from their standard definitions. The minimal amplitude is computed from the distribution of the root mean square (RMS) amplitude of the signal within the frequency band of sleep spindles. The detection algorithm depends on the choice of just one free parameter, which is a percentile of this distribution. Performance of detection is assessed on the first cohort/second subset of the Montreal Archive of Sleep Studies (MASS-C1/SS2). Cross-validation performed on the 19 available overnight recordings returned an optimal percentile of the RMS distribution close to 97 in most cases, and the following overall performance measures: sensitivity 0.63 ± 0.06, positive predictive value 0.47 ± 0.08, and Matthews coefficient of correlation 0.51 ± 0.04. These concordances are similar to the results achieved on this database by other automatic methods. The proposed detailed parametrization of sleep spindles within a universal framework, which also encompasses other EEG transients, opens new possibilities for high-resolution investigation of their relations and detailed characteristics. MP decomposition, selection of relevant structures, and simple creation of EEG profiles, used previously for assessment of brain activity of patients with disorders of consciousness, are implemented in the freely available software package Svarog (Signal Viewer, Analyzer and Recorder On GPL), which offers a user-friendly, mouse-driven interface for review and analysis of EEG. Svarog can be downloaded from

Sleep spindles are defined in Rechtschaffen and Kales (

Explosion of the applications of computerized signal processing methods resulted in a multitude of automatic detection algorithms. The most effective so far are based upon a common framework, introduced in Schimicek et al. (

EEG is band-pass filtered in the frequency range related to sleep spindles.

Signal from the previous step is subjected to amplitude thresholding in the time domain.

Epochs exceeding the threshold are filtered in the time domain to select those corresponding to sleep spindles.
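The three steps above can be sketched in a few lines. This is only an illustration of the sequential scheme, not the implementation of any cited detector: the crude DFT band-pass, the 0.25 s moving-RMS envelope, the threshold value and all function names are assumptions made for this example.

```python
import math

def bandpass_dft(x, fs, f_lo, f_hi):
    """Step 1 (crude): rebuild x from the DFT bins lying inside [f_lo, f_hi] Hz."""
    n = len(x)
    out = [0.0] * n
    for k in range(1, n // 2 + 1):
        if not f_lo <= k * fs / n <= f_hi:
            continue
        angles = [2 * math.pi * k * t / n for t in range(n)]
        re = sum(v * math.cos(a) for v, a in zip(x, angles))
        im = -sum(v * math.sin(a) for v, a in zip(x, angles))
        scale = 1.0 if 2 * k == n else 2.0  # the Nyquist bin has no mirror image
        for t, a in enumerate(angles):
            out[t] += scale * (re * math.cos(a) - im * math.sin(a)) / n
    return out

def moving_rms(x, win):
    """Amplitude envelope: RMS in a window of about `win` samples centred on each sample."""
    half = win // 2
    out = []
    for i in range(len(x)):
        seg = x[max(0, i - half): i + half + 1]
        out.append(math.sqrt(sum(v * v for v in seg) / len(seg)))
    return out

def runs_above(values, threshold):
    """Step 2: half-open index ranges (start, stop) where values exceed the threshold."""
    runs, start = [], None
    for i, v in enumerate(values):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(values)))
    return runs

def classic_detector(x, fs, threshold, band=(11.0, 16.0), min_dur=0.5):
    """Steps 1-3 chained; returns (start, stop) times in seconds."""
    envelope = moving_rms(bandpass_dft(x, fs, *band), win=int(0.25 * fs))
    return [(a / fs, b / fs) for a, b in runs_above(envelope, threshold)
            if (b - a) / fs >= min_dur]  # step 3: duration criterion

# Synthetic 4 s test signal: a 3 Hz slow oscillation plus a 1 s burst at 13 Hz.
fs = 128
times = [i / fs for i in range(4 * fs)]
slow = [20 * math.sin(2 * math.pi * 3 * s) for s in times]
burst = [10 * math.sin(2 * math.pi * 13 * s) if 1.5 <= s < 2.5 else 0.0 for s in times]
detections = classic_detector([a + b for a, b in zip(slow, burst)], fs, threshold=3.0)
no_detections = classic_detector(slow, fs, threshold=3.0)
```

Each stage sees only the output of the previous one, which is exactly where the accumulated bias discussed in the text comes from: the duration test in step 3 operates on an envelope already shaped by the filter and the threshold.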

Contrary to the visual detection by human experts, who concentrate directly and separately on relevant transient structures visible in EEG, each step of this sequential procedure implements only one aspect of the definition, and accumulates the bias from the previous steps. This drawback is the consequence of separate application of filters in the frequency and time domains. This turns our attention to the time-frequency methods of signal processing.

Classically, methods like short-time Fourier transform (STFT) and wavelet transform (WT) are used to compute the distribution of signal's energy in the time-frequency plane (Durka and Blinowska,

Algorithm adapting the parameters automatically to the local content of the analyzed signal was introduced in Mallat and Zhang (

This approach has been successfully applied for the detection and parameterization of EEG transients including sleep spindles in different paradigms, mostly at the University of Warsaw. Additionally, MP-based detection of several types of EEG transients can be efficiently combined into an automatic sleep stager, based explicitly upon the accepted criteria for stages (Malinowska et al.,

Detection of sleep spindles presented in this paper relies on the correspondence of their shape (waxing and waning oscillations) to the Gabor functions used in MP decomposition (Figure

MP was proposed by Mallat and Zhang (_{γ}. In plain English, the gist of the MP procedure can be summarized as follows:

We start by creating a huge, redundant set

From this

The above idea is implemented in an iterative procedure: in each step we find the “best” function, and then subtract it from the signal being decomposed in the following steps.

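As for the mathematical description, denoting the function fitted to the signal $x$ in iteration $n$ as $g_{\gamma_n}$, and the residual left after $n$ iterations as $R^n x$, we can write:

$$
\begin{cases}
R^0 x = x \\
R^n x = \langle R^n x, g_{\gamma_n} \rangle\, g_{\gamma_n} + R^{n+1} x \\
g_{\gamma_n} = \arg\max_{g_{\gamma_i} \in D} \left| \langle R^n x, g_{\gamma_i} \rangle \right|
\end{cases}
\tag{1}
$$

where 〈·, ·〉 denotes the inner product of signals and | · | the absolute value. As a result we get an approximate expansion:

$$
x \approx \sum_{n=0}^{M-1} \langle R^n x, g_{\gamma_n} \rangle\, g_{\gamma_n}
\tag{2}
$$

where $M$ is the number of iterations and $g_{\gamma}$ are Gabor functions:

$$
g_{\gamma}(t) = K(\gamma)\, e^{-\pi \left( \frac{t-u}{s} \right)^2} \cos\left( \omega (t-u) + \phi \right)
\tag{3}
$$

where γ is a set of parameters such that γ = (u, ω, s, φ), and K(γ) is chosen such that ||g_{γ}|| = 1.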
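The iterative procedure can be illustrated with a toy implementation. This is a minimal sketch, not the optimized C program distributed with Svarog: the tiny regular dictionary grid, the parametrization of Gabor functions in seconds and Hz (a Gaussian envelope multiplied by a cosine, normalized numerically), and all function names are assumptions made for this example.

```python
import math

def gabor(n, fs, u, s, freq, phase):
    """Unit-norm Gabor function: Gaussian envelope (centre u, width s, seconds) times a cosine."""
    g = [math.exp(-math.pi * ((t / fs - u) / s) ** 2)
         * math.cos(2 * math.pi * freq * (t / fs - u) + phase) for t in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]

def build_dictionary(n, fs):
    """A very small illustrative dictionary; real dictionaries are far larger and redundant."""
    dictionary = []
    for u in [0.5 * k for k in range(1, 8)]:
        for s in (0.5, 1.0):
            for freq in (2.0, 6.0, 10.0, 13.0, 20.0):
                for phase in (0.0, math.pi / 2):
                    dictionary.append(((u, s, freq, phase), gabor(n, fs, u, s, freq, phase)))
    return dictionary

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def matching_pursuit(x, dictionary, n_iter):
    """In each step pick the function with the largest |<residual, g>| and subtract it."""
    residual = list(x)
    book = []  # list of (parameters, coefficient), in order of selection
    for _ in range(n_iter):
        products = [dot(residual, atom) for _, atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(products[i]))
        params, atom = dictionary[best]
        coeff = products[best]
        residual = [r - coeff * a for r, a in zip(residual, atom)]
        book.append((params, coeff))
    return book, residual

# Signal built from two dictionary functions: a strong 13 Hz and a weaker 6 Hz structure.
fs, n = 64, 256
strong = gabor(n, fs, 2.0, 1.0, 13.0, 0.0)
weak = gabor(n, fs, 1.0, 0.5, 6.0, math.pi / 2)
signal = [5.0 * a + 2.0 * b for a, b in zip(strong, weak)]
book, residual = matching_pursuit(signal, build_dictionary(n, fs), n_iter=2)
```

In this example the stronger structure is selected first and the weaker one second, and the residual left after two iterations is negligible because both components belong to the dictionary. With real EEG the fit is never exact, which is why the size of the dictionary and the number of iterations matter.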

The procedure is generic. The only major settings correspond to:

quality of the decomposition, regulated mainly by the size of the dictionary

number of iterations

In both cases, higher settings result in higher accuracy.

Size of the dictionary ^{1}

This special construction of the dictionary, ensuring a uniform distribution in the space of inner products, imposes non-uniform distribution of dictionary's functions in the space of their time positions, widths and frequencies (Kuś et al.,

Number of iterations: the functions $g_{\gamma_n}$ in Equation (2) are ordered by decreasing energy. That means that in two different decompositions differing only in the setting of the number of iterations, say 50 and 100, the first 50 waveforms will be the same (with small exceptions if stochastic decomposition was chosen), and iterations 51–100 will contain only structures of energy smaller than that contributed by $g_{\gamma_{50}}$.

Increasing the number of iterations will not improve the quality of fit of any single waveform, so if we are interested in structures of relatively high energy, as is usually the case when looking for structures which are also visible for human expert, it makes no sense to increase

The MP decomposition described above is a purely mathematical procedure. In relation to EEG analysis, the bad news is:

Computation of the MP decomposition of a signal is relatively time-consuming even on a modern PC.

Settings of the energy error and number of iterations may require some consideration in case of limited computational resources, as discussed above in Sections 2.1.2 and 2.1.3.

The good news is:

Unlike most of the time-frequency methods of signal processing, setting of parameters is not a tradeoff between different aspects of the quality of decomposition, but a tradeoff between the quality and speed.

MP decomposition is generic, and once performed, the same decomposition of given epoch can be used to investigate the presence of different structures (c.f.

The program computing the actual MP decomposition of a given epoch is implemented in C and compiled separately for each platform. It is a command-line program, taking input from a config file and writing output to a binary file containing parameters of the fitted functions (a “book”, *.b). To facilitate its application, we created a wrapper/GUI module for Svarog, which is a multiplatform EEG review system. After installation and configuration of the system (Section 4.4), the user can perform MP decompositions of the epoch selected by mouse, setting the decomposition parameters in tabs of the window displayed in Figure

Data comes from the first cohort/second subset of the Montreal Archive of Sleep Studies (MASS-C1/SS2) (O'Reilly et al.,

We based the assessment of efficiency of the detector on the markings with the accuracy of the EEG sampling, as proposed in O'Reilly et al. (in revision). In such an approach, at each sample (in our case 256 samples per second), there are four well-defined outcomes of comparison of expert's and detector's scorings: spindle present according to both expert and detector (true positives;

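Positive predictive value^{2} (PPV):

$$
\mathrm{PPV} = \frac{TP}{TP + FP}
$$

Matthews coefficient of correlation (MCC):

$$
\mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
$$

where TP, FP, TN and FN are the numbers of samples counted as true positives, false positives, true negatives and false negatives.

Cohen's κ:

$$
\kappa = \frac{P_o - P_e}{1 - P_e}
$$

where $P_o = (TP + TN)/N$ is the agreement observed over all $N = TP + FP + TN + FN$ samples, and $P_e$ is the probability of random agreement defined as:

$$
P_e = \frac{(TP + FN)(TP + FP) + (TN + FP)(TN + FN)}{N^2}
$$

F_{1}-score:

$$
F_1 = \frac{2\,TP}{2\,TP + FP + FN}
$$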
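The measures listed above can be computed directly from the four per-sample counts. The following is a minimal sketch using the standard textbook definitions of these measures; the function names and the toy scorings are assumptions made for this example and do not come from the evaluation code used in the study.

```python
import math

def confusion_counts(expert, detector):
    """Compare two equally long 0/1 sequences sample by sample."""
    tp = sum(1 for e, d in zip(expert, detector) if e and d)
    fp = sum(1 for e, d in zip(expert, detector) if not e and d)
    fn = sum(1 for e, d in zip(expert, detector) if e and not d)
    tn = sum(1 for e, d in zip(expert, detector) if not e and not d)
    return tp, fp, tn, fn

def performance(tp, fp, tn, fn):
    """Sensitivity, PPV, MCC, Cohen's kappa and F1 from per-sample counts."""
    n = tp + fp + tn + fn
    p_o = (tp + tn) / n                                           # observed agreement
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2  # chance agreement
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        "mcc": (tp * tn - fp * fn)
               / math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "kappa": (p_o - p_e) / (1 - p_e),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Toy scorings over 20 samples: the detector marks the spindle two samples late.
expert = [1 if 4 <= i <= 9 else 0 for i in range(20)]
detector = [1 if 6 <= i <= 11 else 0 for i in range(20)]
counts = confusion_counts(expert, detector)
scores = performance(*counts)
```

For this toy case the counts are TP = 4, FP = 2, TN = 12 and FN = 2, giving sensitivity, PPV and F1 of 2/3 and MCC and κ of 11/21. The sketch does not guard against empty classes (for example, a detector that marks nothing), where some denominators become zero.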

Division between the purely mathematical MP decomposition of signals and further neuroscience research is clearly reflected in the structure of the Svarog software package. The first step, briefly covered in Section 2.1, consists of a generic approximation of the signal by a linear sum of Gabor functions. The second step, which is selection of the structures corresponding to sleep spindles, constitutes the main topic of this article.

MP offers explicit parameterization of signal structures in terms of their time and frequency positions, widths and amplitudes. Detection of sleep spindles within the proposed framework can be perceived as filtering out irrelevant structures from a database containing all the waveforms fitted by MP to a given signal epoch. Settings of the filter can be directly based upon the classical definition(s) mentioned in the Introduction. We choose frequency range 11–16 Hz and duration exceeding 0.5 s. Duration and time center of each detected spindle are returned explicitly by the MP algorithm, as parameters

Due to the lack of a precise definition of the minimum amplitude for spindles, one can either adapt a fixed threshold (e.g., Schimicek et al.,

where _{RMS} is the percentile of the mentioned RMS distribution, chosen to maximize resulting MCC.
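Selection of spindles then reduces to filtering the fitted waveforms by frequency, duration and amplitude. The sketch below illustrates the idea under explicit assumptions: each waveform is represented by a hypothetical dictionary with `freq`, `width`, `amplitude` and `position` keys (not the Svarog book format), the width parameter is taken as the duration, and the minimal amplitude is assumed to equal the chosen percentile of the RMS values directly, which may differ from the exact relation used in the study.

```python
import math

def percentile(values, p):
    """p-th percentile (0-100) with linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * p / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def select_spindles(atoms, min_amplitude, band=(11.0, 16.0), min_duration=0.5):
    """Keep waveforms inside the spindle band, longer than min_duration, and strong enough."""
    return [a for a in atoms
            if band[0] <= a["freq"] <= band[1]
            and a["width"] > min_duration
            and a["amplitude"] >= min_amplitude]

# Hypothetical RMS amplitudes (in microvolts) of the signal in the 11-16 Hz band.
rms_values = [v / 10 for v in range(1, 101)]
min_amplitude = percentile(rms_values, 97)

# Hypothetical waveforms from an MP decomposition (times in s, frequencies in Hz).
atoms = [
    {"position": 12.3, "freq": 13.5, "width": 0.8, "amplitude": 12.0},  # spindle-like
    {"position": 15.0, "freq": 13.0, "width": 0.3, "amplitude": 15.0},  # too short
    {"position": 21.7, "freq": 9.0, "width": 1.0, "amplitude": 20.0},   # outside the band
    {"position": 30.1, "freq": 12.0, "width": 1.2, "amplitude": 4.0},   # too weak
]
spindles = select_spindles(atoms, min_amplitude)
```

Because the decomposition is stored once, changing the percentile only re-runs this inexpensive filter, which is what makes scanning the free parameter practical.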

As described in Section 2.4, the minimal amplitude of candidate waveform is a free parameter in the proposed detector of sleep spindles. In order to have a complete picture of the detector performance on the current dataset, in Figure

A common pitfall in the evaluation of algorithms detecting sleep spindles is their explicit optimization for a particular dataset, often the same one that is then used to present the performance of the resulting algorithm. This problem is shared by detection algorithms in general, and the standard remedy used in machine learning is cross-validation.

For the evaluation of performance of the proposed method, we implement the following cross-validation procedure, related to the only parameter not taken directly from the definition of sleep spindles, which is the minimal amplitude expressed in terms of the percentile of RMS distribution in the frequency range of sleep spindles:

Randomly divide the available recordings in two disjoint subsets, further called the training set and the validation set.

Compute the optimal percentile for the training set.

Evaluate the performance on the validation set.

Repeat steps (1–3).

By averaging resulting performance measures over different random divisions of the available dataset we obtain an estimate of the average performance of the procedure on “unseen” data. This estimate tends to be a bit lower than the overall performance computed and estimated on the whole dataset at once.
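The procedure above can be sketched as a repeated random split. In this illustration the default split of 14 training recordings and 100 repetitions follows the numbers reported in this section, while `optimal_percentile` and `evaluate` are hypothetical callbacks standing for the per-recording optimization and scoring, and the toy recordings are invented solely to make the example runnable.

```python
import random
from statistics import mean, stdev

def cross_validate(recordings, optimal_percentile, evaluate,
                   n_train=14, n_repeats=100, seed=0):
    """Repeated random train/validation splits; returns all validation scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_repeats):
        shuffled = list(recordings)
        rng.shuffle(shuffled)                     # step 1: random disjoint split
        train, valid = shuffled[:n_train], shuffled[n_train:]
        # step 2: average the percentiles that are optimal for each training recording
        threshold = mean(optimal_percentile(r) for r in train)
        # step 3: apply the averaged percentile to the unseen recordings
        scores.extend(evaluate(r, threshold) for r in valid)
    return scores

# Toy stand-ins: each "recording" knows its own best percentile, and the score
# drops linearly as the applied percentile moves away from it.
toy_recordings = [{"id": i, "best": 95 + i % 5} for i in range(19)]
toy_scores = cross_validate(
    toy_recordings,
    optimal_percentile=lambda r: r["best"],
    evaluate=lambda r, p: 1.0 - abs(r["best"] - p) / 10,
)
summary = (mean(toy_scores), stdev(toy_scores))
```

With 19 recordings and 14 used for training, each repetition contributes 5 validation scores, so 100 repetitions yield 500 values from which the mean and standard deviation are computed.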

We performed 100 iterations of the cross-validation procedure, each time randomly choosing 14 recordings for the training set used to compute the optimal RMS percentile. Then these 14 percentiles _{RMS}, optimal for each of the recordings separately, were averaged. The resulting average threshold was applied to find the minimal spindle amplitudes for all of the remaining 5 recordings. Figure

Sensitivity | 0.63 | 0.59 | 0.68 | 0.63 | 0.06 |

PPV | 0.47 | 0.42 | 0.52 | 0.47 | 0.08 |

MCC | 0.52 | 0.49 | 0.53 | 0.51 | 0.04 |

Cohen kappa | 0.49 | 0.46 | 0.52 | 0.49 | 0.05 |

F_{1}-score | 0.54 | 0.51 | 0.56 | 0.53 | 0.04 |

The proposed approach offers precise detection of time centers and durations of sleep spindles and other transients. Apart from these, MP decomposition also provides an explicit, high-resolution parameterization of their frequencies, amplitudes and phases. This gives simple access to detailed information on the pattern of their occurrences across the whole analyzed recording, including:

exact time occurrences of each detected structure with information about amplitude of each detected spindle.

number of structures per epoch (in sleep analysis this is traditionally 20 or 30 s).

percent of the epoch's time occupied by selected transients.
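The last two reports in this list can be derived from the list of detected structures alone. The sketch below is an illustration under stated assumptions: detections are given as non-overlapping (start, stop) pairs in seconds, each structure is counted in the epoch containing its time center, and the function name and example values are hypothetical.

```python
import math

def epoch_profiles(events, total_duration, epoch_length=20.0):
    """Per-epoch count of structures and percent of epoch time they occupy."""
    n_epochs = math.ceil(total_duration / epoch_length)
    counts = [0] * n_epochs
    occupied = [0.0] * n_epochs
    for start, stop in events:
        centre = (start + stop) / 2
        counts[min(int(centre // epoch_length), n_epochs - 1)] += 1
        first = int(start // epoch_length)
        last = min(int(stop // epoch_length), n_epochs - 1)
        for k in range(first, last + 1):
            # part of this structure falling inside epoch k
            lo = max(start, k * epoch_length)
            hi = min(stop, (k + 1) * epoch_length)
            occupied[k] += max(0.0, hi - lo)
    percent = [100 * o / epoch_length for o in occupied]
    return counts, percent

# Four detected structures in a 60 s recording; the third straddles an epoch boundary.
events = [(2.0, 3.0), (10.0, 11.5), (19.5, 20.5), (45.0, 46.0)]
counts, percent = epoch_profiles(events, total_duration=60.0)
```

For these values the counts per 20 s epoch are 2, 1 and 1, and the occupied time is 15%, 2.5% and 5% of the respective epochs. The structure straddling the boundary is counted once, but its duration is shared between the two epochs.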

Although the last parameter has not been used for sleep spindles so far, all these reports are presented for demonstration in the three upper panels of

Sleep spindles are not the only EEG transients which can be effectively detected and parameterized by means of proposed approach. Another classic example of transient structures crucial for assessment of the sleep process are slow waves (Durka et al.,

These profiles can be used for investigating several features of EEG, previously assessed by different specially constructed algorithms, or by visual inspection. For example:

Report in the lower panel of Figure

Profiles for these and other structures were used for assessment of the brain activity of patients in different states of disorders of consciousness (Malinowska et al.,

As mentioned in Section 2.1, in each step of the MP algorithm we compute inner products of all the functions from the dictionary with the signal (or the residuum left from previous iterations). Implemented directly, this would typically result in millions of inner products, each computed on thousands of samples. Such massive computations impose a significant burden even for modern computers. Fortunately, it is possible to decrease it significantly with mathematical and programming tricks. The former, implemented in the current version of the MP algorithm used for computations in this article and available together with Svarog from

MP decomposition is performed only once for each analyzed signal, and as such need not be interactive. Using one such general decomposition, we can investigate any structures potentially present in the signal (Section 4.3) in a comfortably interactive mode. Results from one channel of an overnight recording like the one presented in Figure

There is still room for significant speed improvements, in the optimization of code (e.g., multithreading or using GPUs) as well as in the adjustments of the decomposition parameters to a particular problem. As an example of the latter we may quote an online procedure for detection of epileptic seizures in commercial EEG software by Persyst (

The reported performance of sleep spindle detectors depends both on the properties of the detector and on the quality of the experts' scores. Therefore, a quantitative comparison of detectors is possible only on the same database of EEG recordings and scorings; otherwise the comparison is rather qualitative. This is especially so if the parameters of the detector are tuned to maximize the performance for a given dataset. Another problem in comparing results reported in the literature is that various authors define a correct detection in different ways within the “window based” type of comparison, mainly with respect to the criteria defining the overlap between the detector's and the experts' scores. We used a “signal-sample-based” assessment of performance, since we find it much less ambiguous. In general, the values obtained in the “signal-sample-based” type of comparison are more conservative than those obtained in the “window based” comparison, as was demonstrated in O'Reilly et al. (in revision). Unfortunately, the “window based” comparison is the most common, and for a long time it was the only one considered for assessing the performance of spindle detection presented in the literature. To give a general background we cite some of these results below.

For example, one of the first automated detection methods with a fixed amplitude threshold (Schimicek et al., ^{3}

A more direct comparison of the detector presented in this work can be made with the six automatic detectors, known from publications, reimplemented and tested in Warby et al. (^{4}_{1}-score is close to the maximum performance for the auto group consensus. Such a result would indicate that the proposed detector is well balanced and close to optimal among the automated detectors, but we have to keep in mind that we compare results for different datasets.

The most meaningful and direct comparison can be made with the four detectors tested in O'Reilly et al. (in revision), since they were tested on exactly the same data set, with the same expert scoring, and using the same “signal-sample-based” type of comparison. For ease of comparison, in Figure

In the context of a universal parameterization of EEG transients (Durka,

We believe that the availability of the free software and exemplary description of a framework for detection of sleep spindles paves the way to novel and creative applications of this high-resolution parametrization, to a large extent compatible with the tradition of visual analysis.

Complete software package (with source code) used in this study for computing MP decompositions and generating Figure

Polysomnograms and human scoring of sleep spindles used in this study come from MASS database and can be downloaded from

PD proposed and designed the major steps of MP parameterization of EEG transients and detection of spindles, supervised and contributed to the development of the software, designed the current study and wrote most of the text. PR wrote the interactive plugin for detection of structures and display of reports from MP decompositions, fixed the Svarog interface to MP and bugs found during preparation of this study, and consulted on mathematical aspects of MP. UM contributed to tests of the software, data analysis and interpretation, drafting of the work and reviewing the manuscript. MZ contributed to tests of the software, tested several detection schemes and performed a large part of the data analysis and comparisons with visual detections. COR performed MP decompositions on the MASS database, performed analyses, and contributed to writing and reviewing the manuscript. JZ adjusted details of the detection algorithm, supervised the comparison with visual detection, performed cross-validation analyses, and contributed to writing and reviewing the manuscript.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

^{1}ϵ relates to the maximum distance between two neighboring functions available for decomposition. The distance between two Gabor functions _{1} and _{2} from the dictionary _{1}|_{2}〉 related to the energy as

The dictionary is constructed in such a way that this distance is kept uniform across the neighboring functions. When fitting the dictionary's functions to a signal, the maximum error occurs when a signal's structure falls exactly in between two functions available in the dictionary. In such a dictionary, this error will not exceed the distance between neighboring functions from the dictionary. In energy units it will be _{1}, _{2})^{2}, that is, the (maximum) “energy error” ϵ.

^{2}PPV is related to False Discovery Rate as: PPV = 1 − FDR.

^{3}One should be careful when reading this paper, since the authors call false-positive rate what is usually referred to as false detection rate. False positive rate is generally considered as

^{4}Precision is another name for PPV and recall for sensitivity.