Human cognitive performance degrades with ageing. Studies have shown that cognitive training, especially electroencephalography (EEG)-based brain-computer interface (BCI) and neurofeedback training, can improve cognition in the elderly. Accurate attention detection is a prerequisite for the successful implementation of such BCI systems. While most studies use EEG spectral features for attention detection, a more effective feature learning technique is needed. In this study, we addressed this need with a deep convolutional neural network (CNN), an approach that has been successfully deployed in computer vision and speech recognition and is gaining traction in BCI research.
The EEG data were recorded from a single prefrontal bipolar channel in 120 elderly subjects while they performed an attention-demanding task (the Stroop color test) with resting periods between trials. The EEG data were first band-pass filtered at 0.5-40 Hz and then segmented into 2-second intervals with 50% overlap. The data were also screened to discard noisy segments. The aim was to distinguish attention (i.e., Stroop) trials from non-attention (i.e., idle) trials. We used linear discriminant analysis (LDA) as the baseline classification framework. Band powers (delta, theta, alpha, beta, and low gamma) were extracted using a 5-level wavelet decomposition and fed to the LDA classifier, which yielded an average accuracy of 62.20%. We then performed an end-to-end analysis of the raw EEG by constructing a deep convolutional neural network (deep CNN) for attention detection. In this case, normalized raw EEG segments were fed into the network instead of spectral features, so the deep CNN serves as both feature learner and classifier. The proposed network employs tensor-based modelling, which can capture the information underlying complex attentive behaviour and is therefore able to detect attentive and non-attentive mental states. Besides convolutional and pooling layers, we employed two fully connected layers as well as dropout layers to avoid overfitting. Through the successive layers, highly compact deep attentional features were formed for discriminating between attentive and non-attentive states. This technique led to a significantly higher accuracy (71.73%) than the baseline (p
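A minimal sketch of the baseline pipeline described above is given below: band-pass filtering at 0.5-40 Hz, segmentation into 2-second windows with 50% overlap, 5-level wavelet band powers, and an LDA classifier. The sampling rate (256 Hz), the `db4` mother wavelet, and all helper names are illustrative assumptions, not values reported in the paper.

```python
import numpy as np
import pywt
from scipy.signal import butter, filtfilt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

FS = 256          # assumed sampling rate (Hz); not specified in the abstract
WIN = 2 * FS      # 2-second segments
STEP = WIN // 2   # 50% overlap

def bandpass(x, low=0.5, high=40.0, fs=FS, order=4):
    """Zero-phase band-pass filter at 0.5-40 Hz."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, x)

def segments(x, win=WIN, step=STEP):
    """Slice a 1-D recording into overlapping 2-second windows."""
    return np.stack([x[i:i + win] for i in range(0, len(x) - win + 1, step)])

def band_powers(seg, wavelet="db4", levels=5):
    """5-level wavelet decomposition of one segment; mean power of the
    approximation (delta) and the four coarsest detail bands, used here as
    rough proxies for theta, alpha, beta and low gamma at fs = 256 Hz."""
    coeffs = pywt.wavedec(seg, wavelet, level=levels)  # [A5, D5, D4, D3, D2, D1]
    return np.array([np.mean(c ** 2) for c in coeffs[:5]])  # drop D1 (>64 Hz)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.standard_normal(60 * FS)          # one minute of synthetic EEG
    segs = segments(bandpass(raw))
    feats = np.vstack([band_powers(s) for s in segs])
    labels = rng.integers(0, 2, len(feats))     # placeholder attention/idle labels
    clf = LinearDiscriminantAnalysis().fit(feats, labels)
    print("training accuracy:", clf.score(feats, labels))
```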
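The following PyTorch sketch illustrates the kind of 1-D CNN described above: convolution and pooling layers followed by two fully connected layers with dropout, operating on normalized raw 2-second single-channel segments (512 samples at the assumed 256 Hz). Filter counts, kernel sizes, and the dropout rate are assumptions for illustration; the abstract does not specify the architecture.

```python
import torch
import torch.nn as nn

class AttentionCNN(nn.Module):
    """Illustrative CNN for attentive vs. non-attentive classification."""
    def __init__(self, n_samples=512, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3),   # temporal convolution
            nn.ReLU(),
            nn.MaxPool1d(4),                              # downsample by 4
            nn.Conv1d(16, 32, kernel_size=7, padding=3),
            nn.ReLU(),
            nn.MaxPool1d(4),
        )
        feat_dim = 32 * (n_samples // 16)                 # after two 4x poolings
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(0.5),                              # dropout against overfitting
            nn.Linear(feat_dim, 64),                      # first fully connected layer
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(64, n_classes),                     # attentive vs. non-attentive
        )

    def forward(self, x):                                 # x: (batch, 1, n_samples)
        return self.classifier(self.features(x))

# Forward pass on a dummy batch of z-scored raw EEG segments
model = AttentionCNN()
dummy = torch.randn(8, 1, 512)
logits = model(dummy)                                     # shape: (8, 2)
```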