
Edited by: Shuhei Yamaguchi, Shimane University, Japan

Reviewed by: Juan Zhou, Duke-NUS Medical School, Singapore; Zhen Yuan, University of Macau, China

*Correspondence: Issaku Kawashima

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

Mind-wandering (MW), or task-unrelated thought, has been examined in a growing number of studies that use models built on numerous physiological variables to predict whether subjects are in MW. However, these models are not applicable in general situations. Moreover, they output only a binary classification. The current study suggests that the combination of electroencephalogram (EEG) variables and non-linear regression modeling can be a good indicator of MW intensity. We recorded EEGs of 50 subjects during the performance of a Sustained Attention to Response Task, including a thought sampling probe that inquired about the focus of attention. We calculated the power and coherence values, prepared 35 patterns of variable combinations, and applied Support Vector machine Regression (SVR) to them. Finally, we chose four SVR models: two non-linear and two linear; two of the four models are composed of a limited number of electrodes to satisfy model usefulness. Examination using the held-out data indicated that all models had robust predictive precision and provided significantly better estimations than a linear regression model using single electrode EEG variables. Furthermore, in the limited-electrode condition, the non-linear SVR model showed significantly better precision than the linear SVR model. The method proposed in this study helps investigations into MW in various little-examined situations. Further, by measuring MW with high-temporal-resolution EEG, unclear aspects of MW, such as time series variation, are expected to be revealed. Furthermore, our suggestion that a few electrodes can also predict MW contributes to the development of neuro-feedback studies.

Mind-wandering (MW; Smallwood and Schooler,

Therefore, MW is a critical theme for various psychiatric problems, including depression, and further research is needed. To investigate MW, it is important to know how to evaluate it. Currently, many studies measure MW using thought sampling during a task. In this method, subjects complete a tedious task, such as a Sustained Attention to Response Task (SART; Robertson et al.,

However, both methods have some limitations. First, the self-caught method is affected by the meta-awareness ability of subjects. People cannot generally monitor whether they are in MW, and awareness of MW is intermittent (Schooler et al.,

Recently, an increasing number of reports have predicted the presence of MW using multivariate analyses of biological measures. Mittner et al. (

However, these studies also have some limitations when evaluating MW. First, the models proposed by these studies only provide a binary estimation and are not able to refer to the “deepness” of MW. Previous studies claim that MW is not dichotomous but a phenomenon with continuous intensity (Schad et al.,

Electroencephalogram (EEG) measurement is easy, places few constraints on the measurement setting, and is expected to carry an adequate amount of information. An EEG indicator can reveal the nature of MW in uninvestigated conditions, such as trying to sleep or meditate. Further, an EEG model is useful for neuro-feedback. If an EEG model requiring a small number of electrodes is obtained, a simplified portable EEG device can provide feedback on one's MW. The model may enable subjects to take mobile EEG feedback devices home like Zich et al. (

Some previous studies reported EEG features associated with MW, and many scholars investigated whether EEG changes represent DMN activity. Scheeringa et al. (

In addition to midline areas, studies by Braboszcz and Delorme (2011) and Berkovich-Ohana et al. (

As discussed above, previous research implies that mainly the DMN and ECN domains relate to MW occurrence; however, considering the inconsistency of the frequency bands reported, EEG features indicate that the MW state may appear in a wide frequency area. Further, considering that the relation between ECN activity and MW varies according to the intensity of MW, the correlation between an ECN activity and MW may be non-linear.

In the present study, we aim to demonstrate what kind of regression model can estimate MW deepness. Because several brain areas and various frequency bands are associated with MW, we hypothesize that a combination of multiple EEG variables predicts MW better than a single variable. Furthermore, we propose that non-linear models are more suitable than linear ones owing to the complex relation between ECN activity and MW.

We fit several regression models that predict the intensity of MW, obtained from probe-caught thought sampling, from multiple EEG variables using the Support Vector machine Regression (SVR) algorithm. SVR handles high-dimensional data well and provides not only a linear model but also a non-linear one. Few studies have tried to predict subjective reports from neural variables with SVR. Hoexter et al. (

We called for participants using posters at Waseda University, and 50 people participated in the experiment. We set two exclusion criteria for the analysis: first, two people who scored more than 2

The study was approved by the Waseda University Academic Research Ethical Review Committee, and all participants provided written informed consent.

After informed consent for participation was obtained, we assessed the subjects for depression symptoms by the Center for Epidemiologic Studies Depression Scale (CES-D; Radloff,

We subsequently introduced three tasks for subjects and confirmed their understanding with some practice. After EEG electrodes were attached to them, they completed two tasks and rested for approximately 10 min; one task remained to be performed.

We acquired EEG data during three tasks. Each task included measures lasting 14 min before and after a 30-s resting state, except for the time of thought sampling and presentation of instructions. Task 1 required subjects to tap their finger, and task 2 was an oddball task. However, we do not report them in this article.

Task 3, described in this study, modeled SART (Figure

The procedure of Task 3.

To confirm that the reports of the MW intensity are valid, we investigated whether the behavioral data and off-task reports correlated. We adopted a statistical method that was used in a previous study (Kucyi et al.,

We recorded the EEG data using the Geodesics EEG system (Electrical Geodesics Inc.) and 17 electrodes (F3, F4, F7, F8, Fz, T3, T4, TP9, TP10, P5, P6, P9, P10, Pz, O1, O2, and Oz), with a 250 Hz sampling rate, referenced to the Cz electrode. Impedance was kept under 50 kΩ as per the recommendation of Electrical Geodesics Inc. We filtered the data using a 0.3–70 Hz band-pass filter and a 50 Hz notch filter. This filtering process was completed using Waveform Tools, Net Station Version 4.2 (Electrical Geodesics Inc.).

The EEG data was divided into 1-s epochs, and the ones contaminated with artifacts, such as eye movement, blinks, and body movement, were removed. The artifact detection algorithm was the same as that provided by Waveform Tools. All epochs were Fourier-transformed, and the mean power value and the coherence between each pair of electrodes in eight frequency bands (Kubicki et al.,

We then divided all participants into two groups: one provided the training dataset, and the other, comprising one-third of all subjects, provided the test dataset, which was used not for model construction but for verification. Finally, we removed as outliers the sections in which any EEG variable (i.e., the mean power value or the coherence) had a Z-score greater than 5. Using the abovementioned process, we obtained a training dataset of 440 data samples and a test dataset of 187 data samples (note that many sections were entirely contaminated by artifacts and removed). One data sample included 1,224 predictors [(17 electrodes for power values + $_{17}C_{2} = 136$ electrode pairs for coherence) × 8 frequency bands] and one response variable, the intensity of MW, which is the target to predict. Both datasets were scaled by the average and variance of the training dataset.
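The predictor count and the scaling step described above can be checked with a short sketch. It is illustrative only: the toy values and the helper name `standardize` are assumptions rather than the authors' code, and dividing by the standard deviation (rather than the variance itself) is an assumed reading of "scaled by the average and variance". Only the electrode and band counts come from the text.

```python
import math
import statistics

N_ELECTRODES = 17  # from the text
N_BANDS = 8        # from the text

# Coherence is computed for every unordered pair of electrodes.
n_pairs = math.comb(N_ELECTRODES, 2)
n_predictors = (N_ELECTRODES + n_pairs) * N_BANDS


def standardize(train_column, test_column):
    """Scale both columns using the TRAINING mean and standard deviation only.

    Hypothetical helper: the test data never contributes to the statistics,
    so no information leaks from the held-out subjects into the model.
    """
    mean = statistics.fmean(train_column)
    sd = statistics.pstdev(train_column)
    train_scaled = [(v - mean) / sd for v in train_column]
    test_scaled = [(v - mean) / sd for v in test_column]
    return train_scaled, test_scaled


# Toy values (assumption) for a single predictor column.
train_scaled, test_scaled = standardize([1.0, 2.0, 3.0], [2.0, 4.0])
print(n_pairs, n_predictors)
print(test_scaled)
```

Reusing the training statistics for the test set keeps the held-out evaluation honest, which matches the role the text assigns to the test dataset.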

Because the collected data included many predictors, raising concern about over-fitting (severe deterioration of prediction accuracy when a model is applied to novel datasets), the predictors needed to be selected. The current study employed a filter technique with Pearson's correlation coefficient, which is applicable to the model-fitting algorithms we used. We screened out predictors whose absolute value of correlation coefficient |
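A correlation-based filter of this kind might be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the toy data, and the choice to keep predictors whose absolute correlation is at or above the threshold (rather than strictly above it) are all hypothetical and are not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def filter_predictors(columns, response, threshold):
    """Keep the names of predictors whose |r| with the response >= threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, response)) >= threshold
    ]


# Toy training data (assumption): three predictors and one response.
response = [1, 2, 3, 4, 5]
columns = {
    "a": [2, 4, 6, 8, 10],   # perfectly positively correlated
    "b": [1, -1, 1, -1, 1],  # uncorrelated with the response
    "c": [5, 4, 3, 2, 1],    # perfectly negatively correlated
}
selected = filter_predictors(columns, response, threshold=0.3)
print(selected)
```

Because the filter uses the absolute value, strongly negative predictors such as `c` survive alongside strongly positive ones. The correlations should be computed on the training data only so that selection does not leak information from the test set.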

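Support Vector machine Regression (SVR) is based on a linear regression function:

$$f(x_i) = w^{\top} x_i + b,$$

where $x_i$ denotes the predictor vector of the $i$-th data sample, $w$ the weight vector, and $b$ the intercept. In the standard ϵ-insensitive formulation, the function is fitted by minimizing

$$\frac{1}{2}\lVert w \rVert^{2} + C \sum_{i} \max\bigl(0,\ \lvert y_i - f(x_i) \rvert - \epsilon\bigr),$$

where $y_i$ is the response value of the $i$-th data sample, and ϵ and C are user-defined parameters: ϵ sets the width of the tolerated error and C sets the penalty on errors beyond that width. The solution depends on the data only through the inner product $\langle x_i, x_j \rangle$, which is used to solve the above problem (for detail, see Hastie et al.). By replacing the inner product with a kernel function $K(x_i, x_j)$, SVR provides non-linear regression models. The present study used linear SVR and Radial Basis Function (RBF) kernel SVR:

$$K(x_i, x_j) = \exp\bigl(-\gamma \lVert x_i - x_j \rVert^{2}\bigr).$$

γ is also a user-defined parameter and regulates model simplicity.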
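To make the roles of γ and ϵ more concrete, the sketch below implements the textbook RBF kernel and the ϵ-insensitive loss used in standard SVR. The function names and the numeric inputs are assumptions chosen for illustration and do not come from the study.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """Textbook RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Errors within +/- epsilon cost nothing; larger errors cost the excess."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Identical points have similarity 1; similarity decays with distance.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
far = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)

# A residual of 0.2 is inside the tolerance; a residual of 1.0 is not.
inside = epsilon_insensitive_loss(3.0, 3.2, epsilon=0.25)
outside = epsilon_insensitive_loss(3.0, 4.0, epsilon=0.25)
print(same, far, inside, outside)
```

A larger γ makes the kernel decay faster with distance, so each training sample influences a smaller neighborhood and the fitted function can become more complex; a smaller γ yields a smoother, simpler model.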

We determined ϵ, C, and γ using a grid search, which builds a model for every combination of candidate parameter values and adopts the combination with the best prediction accuracy. The grid search method uses cross-validation to estimate precision. In this approach, the training dataset is divided into several (in this study, 10) groups of as equal size as possible; one group is set as the test dataset in cross-validation, and the others are set as the training dataset in cross-validation. After fitting a model with the training dataset in cross-validation, we evaluated the mean squared error (
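A grid search with 10-fold cross-validation of this kind could look like the sketch below. It uses scikit-learn as an assumed toolkit (the text does not state which software performed the search), synthetic data in place of the EEG predictors, and a small hypothetical grid of powers of two; none of these choices should be read as the study's actual configuration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Synthetic stand-in data (assumption): 60 samples, 5 predictors,
# with a non-linear dependence of the response on the first predictor.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)

# Hypothetical candidate values; every combination is tried.
param_grid = {
    "epsilon": [2.0**k for k in (-4, -3, -2)],
    "C": [2.0**k for k in (-1, 0, 1)],
    "gamma": [2.0**k for k in (-3, -2, -1)],
}

# Ten folds of near-equal size; each fold serves once as the validation set.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid,
    scoring="neg_mean_squared_error",  # scikit-learn maximizes, so MSE is negated
    cv=cv,
)
search.fit(X, y)

best_mse = -search.best_score_  # mean cross-validated MSE of the best combination
print(search.best_params_, round(best_mse, 3))
```

For the linear variant, `SVR(kernel="linear")` can be searched over `epsilon` and `C` alone, since γ has no effect on a linear kernel.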

We applied single regression analysis to the training dataset of the single predictor pattern and two SVR algorithms to the other predictor-set patterns and estimated each cross-validation

While Model 1 was expected to provide as high a level of accuracy as possible and meet the demands of basic MW research, Model 2 was adapted to limited-measurement environments, such as neuro-feedback at home, and was expected to show a lower score than Model 1 but close to it. Models 3 and 4 were created to examine whether a non-linear model predicts MW more precisely than a linear model, and Model 5 was used to confirm that the multiple regression models (Models 1–4) had better precision than previously proposed single regression models. For these comparisons, we examined the significant difference in

First, to provide the validation of the reports of MW intensity, we used a Wilcoxon signed rank test. The mean within-subject correlation between RT variance and reported MW intensity was positive and significantly >0 (
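As a rough illustration of this validation step, the sketch below runs a one-sided Wilcoxon signed-rank test on a list of within-subject correlation coefficients. The correlation values are toy numbers, the function name is hypothetical, and the exact (enumeration-based) p-value is an assumed variant; the text does not say whether the study used an exact or an approximate test.

```python
def signed_rank_p_greater(values):
    """Exact one-sided Wilcoxon signed-rank test for H1: median > 0.

    Assumes no zeros and no ties among the absolute values.
    Returns (W+, p), where W+ is the sum of ranks of the positive values.
    """
    n = len(values)
    # Rank observations by absolute value (rank 1 = smallest).
    order = sorted(range(n), key=lambda i: abs(values[i]))
    w_plus = sum(rank for rank, i in enumerate(order, start=1) if values[i] > 0)

    # counts[s] = number of sign assignments whose positive-rank sum is s.
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]

    # Under H0 every sign assignment is equally likely (2**n in total).
    p_value = sum(counts[w_plus:]) / 2**n
    return w_plus, p_value


# Toy within-subject correlations between RT variance and MW rating (assumption).
correlations = [0.30, -0.10, 0.25, 0.40, 0.05]
w_plus, p_value = signed_rank_p_greater(correlations)
print(w_plus, p_value)
```

Testing the per-subject coefficients against zero, rather than pooling all probes, respects the fact that probes from the same subject are not independent.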

The coherence between electrode Pz and O1 in the beta-3 band showed the strongest correlation with response values (|

Cross-validation scores of Support Vector machine Regression (SVR) models on each threshold and number of electrodes.

The

0.000 | 17 | 0.858 | 0.0005 | 0.500 | 0.500 | 0.850 | 0.0005 | 0.250 | 0.125 |

0.010 | 17 | 0.852 | 0.0005 | 0.500 | 0.250 | 0.843 | 0.0005 | 0.250 | 0.125 |

0.020 | 17 | 0.846 | 0.0005 | 0.500 | 0.250 | 0.838 | 0.0005 | 0.250 | 0.125 |

0.030 | 17 | 0.837 | 0.0005 | 0.500 | 0.250 | 0.832 | 0.0005 | 0.250 | 0.125 |

0.040 | 17 | 0.830 | 0.0005 | 0.500 | 0.250 | 0.826 | 0.0005 | 0.250 | 0.125 |

0.050 | 17 | 0.828 | 0.0010 | 0.500 | 0.250 | 0.820 | 0.0010 | 0.250 | 0.063 |

0.060 | 17 | 0.826 | 0.0010 | 0.500 | 0.250 | 0.819 | 0.0010 | 0.250 | 0.063 |

0.070 | 17 | 0.821 | 0.0010 | 0.500 | 0.250 | 0.815 | 0.0010 | 0.250 | 0.063 |

0.080 | 17 | 0.818 | 0.0010 | 0.500 | 0.125 | 0.815 | 0.0010 | 0.250 | 0.063 |

0.090 | 17 | 0.809 | 0.0010 | 0.500 | 0.125 | 0.809 | 0.0010 | 0.500 | 0.125 |

0.100 | 17 | 0.805 | 0.0010 | 0.500 | 0.125 | 0.805 | 0.0010 | 0.500 | 0.125 |

0.110 | 17 | 0.811 | 0.0020 | 0.500 | 0.125 | 0.798 | 0.0020 | 0.500 | 0.063 |

0.120 | 17 | 0.804 | 0.0020 | 0.500 | 0.125 | 0.796 | 0.0020 | 0.500 | 0.063 |

0.130 | 17 | 0.794 | 0.0020 | 0.500 | 0.125 | 0.790 | 0.0020 | 0.500 | 0.063 |

0.140 | 17 | 0.787 | 0.0020 | 0.500 | 0.125 | 0.785 | 0.0020 | 0.500 | 0.063 |

0.150 | 17 | 0.779 | 0.0020 | 0.500 | 0.125 | 0.780 | 0.0020 | 0.500 | 0.125 |

0.160 | 17 | 0.790 | 0.0039 | 0.500 | 0.125 | 0.776 | 0.0039 | 0.500 | 0.063 |

0.170 | 17 | 0.770 | 0.0039 | 0.500 | 0.125 | 0.763 | 0.0039 | 0.500 | 0.063 |

0.180 | 17 | 0.759 | 0.0039 | 0.500 | 0.125 | 0.761 | 0.0039 | 0.125 | 0.063 |

0.190 | 17 | 0.762 | 0.0039 | 0.500 | 0.125 | 0.762 | 0.0039 | 0.500 | 0.063 |

0.200 | 17 | 0.764 | 0.0078 | 0.500 | 0.125 | 0.755 | 0.0078 | 0.250 | 0.063 |

0.210 | 16 | 0.760 | 0.0078 | 0.500 | 0.125 | 0.746 | 0.0078 | 2.000 | 2.000 |

0.230 | 15 | 0.775 | 0.0156 | 0.500 | 0.125 | 0.743 | 0.0156 | 2.000 | 2.000 |

0.240 | 14 | 0.772 | 0.0156 | 0.500 | 0.125 | 0.745 | 0.0156 | 0.016 | 0.063 |

0.250 | 14 | 0.759 | 0.0156 | 0.500 | 0.125 | 0.745 | 0.0156 | 0.250 | 0.063 |

0.260 | 11 | 0.770 | 0.0313 | 0.250 | 0.125 | 0.747 | 0.0313 | 0.125 | 0.063 |

0.270 | 10 | 0.767 | 0.0313 | 2.000 | 4.000 | 0.756 | 0.0313 | 0.125 | 0.063 |

0.290 | 5 | 0.811 | 0.1250 | 0.500 | 0.063 | 0.771 | 0.1250 | 2.000 | 1.000 |

0.300 | 4 | 0.822 | 0.1250 | 1.000 | 0.250 | 0.789 | 0.1250 | 1.000 | 0.125 |

0.310 | 4 | 0.820 | 0.2500 | 1.000 | 0.500 | 0.782 | 0.2500 | 1.000 | 0.063 |

0.320 | 4 | 0.823 | 0.2500 | 2.000 | 1.000 | 0.783 | 0.2500 | 1.000 | 0.250 |

0.330 | 3 | 0.815 | 0.5000 | 0.500 | 0.250 | 0.787 | 0.5000 | 0.500 | 0.125 |

0.340 | 2 | 0.841 | 1.0000 | 0.500 | 0.125 | 0.833 | 1.0000 | 0.500 | 0.500 |

The set of selected features in Model 1 and 3

We then investigated the precision of the five models with test dataset. The correlation coefficients between estimated values and measured values were

The results of Pearson's correlation test and

Model 1 (RBF, full electrodes) | 0.54 | 1.02E-15 | 3.47 | 0.00026 |

Model 2 (RBF, limited electrodes) | 0.49 | 8.40E-13 | 2.48 | 0.0067 |

Model 3 (linear, full electrodes) | 0.51 | 9.74E-14 | 2.53 | 0.0057 |

Model 4 (linear, limited electrodes) | 0.39 | 4.79E-08 | 0.56 | 0.29 |

Model 5 (single electrodes) | 0.35 | 1.18E-06 | – | – |

The result of

Model 1 vs. Model 3 (full electrodes, RBF vs. linear) | 1.08 | 0.14 |

Model 2 vs. Model 4 (limited electrodes, RBF vs. linear) | 3.49 | 0.00024 |

Model 1 vs. Model 2 (RBF, full vs. limited electrodes) | 1.27 | 0.10 |

Model 3 vs. Model 4 (linear, full vs. limited electrodes) | 2.98 | 0.0014 |

The aim of this study was to prove that a model with multiple EEG variables and non-linear regression estimated MW intensity better than single variable or linear models. First, we confirmed that the RT variance correlates with self-reported MW intensity, as shown in previous research, and validated the reported MW score. Then, we prepared a combination of patterns of predictors and fitted models by SVR. Finally, using prediction accuracy estimated by cross-validation and the number of used electrodes, we proposed five models: Models 1 and 2 are non-linear models, Models 3 and 4 are linear models, Models 2 and 4 use a restricted number of electrodes, and Model 5 is a single regression model. All models showed robustness, and Models 1–3 presented higher accuracy than similar SVR-using studies (Hoexter et al.,

The variable indicating the highest correlation coefficient with response values was the beta 3 coherence between the parietal midline area (Pz) and the occipital area (O1). Because previous research suggests that EEG over the midline area reflects DMN activity, this variable is suspected to relate to the DMN. In addition to this variable, Models 2 and 4 include beta 1 and beta 2 activities over the lateral prefrontal area and beta 1 EEG over the parietal area. Considering that both areas are known to be a part of the ECN (Seeley et al.,

The significantly higher accuracies of Models 1–4 compared to Model 5 partially indicate the validity of using multivariate regression algorithms to estimate MW intensity from EEG data. Models 1 and 3 showed no significant difference in precision, so the superiority of the non-linear regression algorithm over the linear one was not confirmed. However, when the number of electrodes was limited, the non-linear model (Model 2) showed better accuracy than the linear model (Model 4). It seems that Model 5 predicted the intensity of MW from DMN activity, whereas Models 2 and 4 additionally used ECN activity. The non-linear relationship between ECN activity and MW could explain why Model 2, but not Model 4, was more accurate than Model 5. However, Models 1 and 3 both showed significantly higher precision than Model 5, possibly because these models additionally used brain activity related to pressing the button or processing numbers for prediction, so that the importance of ECN activity for prediction might be relatively small for them.

This research has some limitations. First, all subjects were young (averaging 21.77 years), and it is not clear whether the proposed models work on older people. Previous research indicates that aging decreases MW frequency during tasks. Zavagnin et al. (

As a direction for future research, a prediction model focused on MW under a strict definition is worth investigating. The present study used probes that asked where attention was focused. The same probes were used in previous studies in which prediction models were created from physiological measures (Blanchard et al.,

In conclusion, we illustrated that a non-linear regression algorithm with multiple EEG variables estimates MW intensity well. Prediction by EEG enables the evaluation of MW intensity at high temporal resolution and the observation of uninvestigated aspects of MW, such as time-series variation. Moreover, although future research is required, MW estimation by EEG might be applicable to various situations. Our proposed method is expected to clarify the nature of MW in various little-examined situations, such as those involving attempts to sleep or meditate. Further, we demonstrated that EEG data from a few electrodes can also precisely estimate the intensity of MW, which contributes to the development of neuro-feedback studies.

This study was carried out in accordance with the recommendations of Waseda University Academic Research Ethical Review Committee with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Waseda University Academic Research Ethical Review Committee.

IK designed the work and acquired, analyzed, and interpreted data for the work. IK drafted the work. HK substantially contributed to the design of the work and interpreted data for the work. HK revised the draft critically for important intellectual content. IK and HK approve of the version to be published and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

We thank Toru Takahashi and Kaori Usui, Waseda University, for their assistance with preparing our experiment. We are grateful to Keiko Momose Ph.D., Waseda University, for lending her expertise on EEG recording and preprocessing. We appreciate Enago for their editing services. This work was supported by Waseda University Ibuka Funding for “Human Science Research Project Associating Oriental Medicine” and Waseda University Grants for Special Research Projects (2016K-306).