In this project, we present the design and development of a functional Brain-Computer Interface (BCI) system that translates electroencephalography (EEG) signals into actionable outcomes in real time, including imagined speech and robotic control, with potential applications in neuroprosthetics and assistive healthcare technologies. A 14-channel EEG headset was used to acquire data from 15 participants aged 18 to 25. During recording, electrodes were placed over areas associated with motor and cognitive functions to capture the EEG signals relevant to each mental task. Subjects were screened for cognitive stability and prior EEG data consistency to ensure robust signal acquisition. Advanced preprocessing techniques, including artefact removal and bandpass filtering, were applied to enhance signal quality by minimizing noise while preserving critical neural patterns.
We evaluated several machine learning classifiers for interpreting EEG signals, including Decision Trees, Logistic Regression, XGBoost, Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Random Forests, and Gradient Boosting Machines (GBM). Each classifier exhibited unique strengths, with XGBoost and Random Forest excelling in classification accuracy. XGBoost demonstrated superior performance with an average classification accuracy of 98% for robot movement and 96% for emotion recognition.
The successful control of a robotic car wirelessly in real-time through thought-induced commands—right, left, push, and pull—illustrates the practical potential of this BCI system, paving the way for further exploration into more complex control tasks and real-world applications.
A Brain-Computer Interface (BCI) is a technology that establishes direct communication between the brain and external devices such as a machine or a computer. It allows devices to be controlled by speech patterns decoded and interpreted from the brain's neural signals, enabling communication between the brain and a computer without relying on peripheral nerves and muscles.
Various studies have investigated the use of electroencephalography signals for brain-computer interface applications. These studies consistently show that while the non-invasiveness and portability of EEG are beneficial, careful pre-processing and advanced signal analysis methods are essential for converting raw EEG data into meaningful, actionable outcomes. EEG's limited sensitivity to deeper brain regions remains a significant drawback compared with other methods of assessing human brain function, such as fMRI and ECoG.
Recent studies on the use of machine learning in creating non-invasive Brain-Computer Interfaces to analyse emotional parameters have highlighted the need for deeper investigation into the technical limitations of the EEG headsets used to acquire the signal data. They also showed that the lack of standardization in data collection and pre-processing methods has led to inconsistent results, and placed emphasis on the need for more research into classification algorithms for controlling machines through BCIs.
Although remarkable advancements have been made in BCI technology, several challenges and knowledge gaps remain. There is a need for improved signal processing techniques to increase the accuracy and reliability of neural signals, and for refined, optimized machine learning and deep learning algorithms to decode those signals into actionable outcomes.
To address these limitations, practical and intuitive BCI systems must be developed that can detect emotion patterns, decode covert speech patterns, and allow the user to control robots with these speech commands. Such advancements improve communication, enable a better understanding of emotions, and allow precise control of machines, significantly increasing the accuracy and usability of BCIs across domains.
The present study explores the use of BCI to create a collaborative technology that allows users to control robots with thought alone while simultaneously having their emotions detected and screened. Fifteen volunteers were recruited for the signal acquisition phase, selected on one crucial criterion: coarse hair. This was done to push beyond previous research, which centred on individuals with fine hair through which electrodes make scalp contact with minimal effort; in practice, however, establishing contact between the electrodes and coarse hair proved difficult. The electrodes were placed strategically on the scalp, over areas associated with motor movements, speech production, emotion processing, language processing, auditory processing, planning, and coordination. Participants were instructed to visualise four commands (push, pull, left, right) corresponding to the physical commands (forward, backward, left, and right) while EEG recordings were taken. Each recording was repeated 10 times, totalling 400 recordings, and the average response for each command across participants was calculated to compare the average EEG activity linked to each command. This averaging is essential for analyzing and interpreting EEG data, especially event-related potentials (ERPs), because it isolates the neural responses that are time-locked to specific stimuli and commands.
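As an illustration, the trial-averaging step can be sketched in NumPy; the array shapes and random data here are placeholders, not the recorded dataset:

```python
import numpy as np

# Hypothetical epoched data for one command and one participant:
# 10 repetitions x 14 channels x 2048 samples (8 s at 256 Hz).
epochs = np.random.randn(10, 14, 2048)

# Average across repetitions: time-locked ERP components remain,
# while non-time-locked noise averages toward zero.
erp_per_subject = epochs.mean(axis=0)            # shape: (14, 2048)

# Stacking one such average per participant and averaging again
# yields the grand-average ERP used to compare commands.
subject_erps = np.stack([erp_per_subject] * 15)  # placeholder for 15 subjects
grand_average = subject_erps.mean(axis=0)        # shape: (14, 2048)
```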
The acquired signals were then fed into the Digital Signal Processing (DSP) algorithm to pre-process the data. Here, the signals were cleaned using independent component analysis (ICA) and bandpass filtering. After preprocessing, the data was stored on the Raspberry Pi for further analysis.
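One possible implementation of this stage uses the MNE-Python library (MNE is not named in the text; the file name and the excluded component index are illustrative):

```python
import mne

# Load a raw EEG recording (file name is hypothetical).
raw = mne.io.read_raw_edf("subject01_push.edf", preload=True)

# Bandpass filter to the band of interest before ICA.
raw.filter(l_freq=0.1, h_freq=40.0)

# Fit ICA and remove components identified as artifacts (e.g., eye blinks).
ica = mne.preprocessing.ICA(n_components=14, random_state=42)
ica.fit(raw)
ica.exclude = [0]               # artifact component indices, chosen by inspection
raw_clean = ica.apply(raw.copy())
```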
To make the acquired data suitable for deep learning and machine learning algorithms, signal processing techniques were used to analyze and modify the signals, including noise reduction, filtering, and feature extraction. The incoming data is handled by the Digital Signal Processing software, and once the processor completes the required calculations, it sends the processed data to the output.
The previous output was passed through a continuous wavelet transformation (CWT) to enhance data quality, implemented with the PyWavelets library in Python, which provides functions for wavelet-based signal processing. The CWT data was then visualized as a scalogram, a 2D representation of the CWT coefficients that displays frequency change over time. The generated 2D images were fed into a CNN model, stored on the Raspberry Pi, to classify the commands (push, pull, right, left) using visual learning. Additionally, the Raspberry Pi was integrated with the headset for real-time data collection and storage, which allows upscaling to more robust systems. The GPIO pins of the Raspberry Pi were connected to the motor driver pins on the robot car, enabling the Pi to control the robot car using real-time data from the headset.
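The text does not specify the CNN architecture; as a sketch, a small PyTorch model for classifying the scalogram images into the four commands might look like this (layer sizes and image dimensions are assumptions):

```python
import torch
import torch.nn as nn

class ScalogramCNN(nn.Module):
    """Small CNN mapping a 1-channel scalogram image to 4 command classes."""
    def __init__(self, n_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 64), nn.ReLU(),   # assumes 64x64 input images
            nn.Linear(64, n_classes),
        )

    def forward(self, x):            # x: (batch, 1, 64, 64) scalogram images
        return self.classifier(self.features(x))

model = ScalogramCNN()
logits = model(torch.randn(8, 1, 64, 64))   # dummy batch of 8 scalograms
```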
The data was obtained using a 14-channel EEG headset, acquiring raw data from electrodes placed at the international 10-20 system positions AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, and AF4.
Electrodes AF3, AF4, F3, F4, F7, and F8 image neuronal activity in the subject's frontal lobe (lobus frontalis); electrodes FC5, FC6, T7, and T8 scan the temporal lobe (lobus temporalis); electrodes P7 and P8 scan the parietal lobe (lobus parietalis); and electrodes O1 and O2 scan the neural activity of the occipital lobe (lobus occipitalis) (Fig. 1).
The Emotiv Epoc has inbuilt software that allows different parameters to be customised: the sampling rate was set to 256 Hz, the recording channels were set to 14, and the necessary filters, such as the notch filter and the Butterworth filter, were configured.
Near-equal gender representation was achieved, with seven male and eight female participants ranging in age from 18 to 25. A crucial selection criterion was short hair, as long or very coarse hair could hinder electrode placement and signal transmission, creating undesirable artefacts and lowering the signal-to-noise ratio (SNR).
To train the four commands (push, pull, left, right), each subject was asked to perform certain actions, not counting the neutral baseline. First, neutral EEG recordings were taken, for which the subjects relaxed and performed calming breathing exercises so that the brain was recorded while activity was low. For each command, the participant was asked to visualize the command using any mental imagery that triggers the feeling and muscular tension of the action. For the left and right commands, subjects needed a physical feeling of the pulling action to trigger the mental command.
Each participant performed the visualisation for 8 seconds, repeatedly imagining the required instruction; this was repeated 10 times under the same conditions to ensure viable results.
After the signal processing stage, a Continuous Wavelet Transform (CWT) was applied to the data to extract further high-quality, usable features. The CWT is a signal processing technique that extracts and separates frequency information from a time series while retaining the time-domain information. It is similar to the Fourier Transform, but it also shows where along the time series each frequency component occurs.
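A minimal sketch of this step with PyWavelets, the library named above; the Morlet wavelet and the scale range are assumptions, and the input signal is a placeholder:

```python
import numpy as np
import pywt
import matplotlib.pyplot as plt

fs = 256                               # sampling rate (Hz), as configured on the headset
t = np.arange(0, 8, 1 / fs)            # one 8-second trial
eeg = np.sin(2 * np.pi * 10 * t)       # placeholder for one EEG channel

# Continuous Wavelet Transform: each scale maps to a frequency for the wavelet.
scales = np.arange(1, 128)
coeffs, freqs = pywt.cwt(eeg, scales, "morl", sampling_period=1 / fs)

# Scalogram: |coefficients| rendered as frequency content over time.
plt.imshow(np.abs(coeffs), extent=[0, 8, freqs[-1], freqs[0]], aspect="auto")
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.savefig("scalogram_push.png")
```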
The raw EEG data picked up from the scalp is not an accurate representation of the underlying brain signals. Pre-processing therefore begins by filtering out interfering frequencies. Power-line noise (50/60 Hz), a common intruder, was tackled with notch filters. Electromyographic (EMG) noise from muscle activity was reduced through bandpass filtering, focusing on the frequency bands relevant to brain signals (typically 0.1-40 Hz).
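A sketch of these two filtering steps with SciPy; the filter order, notch quality factor, and 50 Hz line frequency are assumptions consistent with the text:

```python
import numpy as np
from scipy import signal

fs = 256                                   # sampling rate (Hz)
eeg = np.random.randn(fs * 8)              # placeholder for one 8-second channel

# Notch filter at 50 Hz to suppress power-line interference.
b_notch, a_notch = signal.iirnotch(w0=50.0, Q=30.0, fs=fs)
eeg_notched = signal.filtfilt(b_notch, a_notch, eeg)

# 4th-order Butterworth bandpass keeping the 0.1-40 Hz band of interest.
sos = signal.butter(4, [0.1, 40.0], btype="bandpass", fs=fs, output="sos")
eeg_filtered = signal.sosfiltfilt(sos, eeg_notched)
```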
For a specific EEG channel n (for example, AF3) at a specific time t, the EEG signal can be represented as the sum of multiple sinusoidal components:

x_n(t) = Σ_k A_k · sin(2π f_k t + φ_k)

where A_k, f_k, and φ_k are the amplitude, frequency, and phase of the k-th sinusoidal component.
Once the EEG signals had been processed and visualised as scalograms, the next step was to pass them through a machine learning model for the classification task. XGBoost was chosen over the other machine learning models because it is lightweight and accurate enough to run on a Raspberry Pi.
XGBoost is an ensemble machine learning model and an efficient implementation of the Gradient Boosted Trees algorithm, a supervised learning method based on function approximation that optimizes specific loss functions and applies several regularization techniques. XGBoost is memory efficient and ideal for limited-memory use cases. We randomly shuffled and divided the data (4,000 signals from 15 individuals) into train (80%) and test (20%) sets. The four commands (push, pull, left, right) were trained using the XGBoost model, which achieved an accuracy of 97%.
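A minimal training sketch with the xgboost and scikit-learn libraries, assuming the extracted features form a 2-D array X with integer command labels y (both are placeholders here):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from xgboost import XGBClassifier

# Placeholder features/labels; in the project these come from the processed EEG.
X = np.random.randn(4000, 64)
y = np.random.randint(0, 4, size=4000)   # assumed coding: 0=push, 1=pull, 2=left, 3=right

# Shuffle and split 80/20, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42)

model = XGBClassifier(n_estimators=200, max_depth=4)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
```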
The EEG headset used for the project was the Emotiv Epoc X 14-channel headset, which transmits data wirelessly. Due to the design of Cortex headsets, the extracted EEG data could not be sent straight to the robot; it had to pass through a laptop connected to both the headset and the Raspberry Pi on the same local network. The data was sent to the Pi in under 0.5 milliseconds over the WebSocket connection.
The Cortex Python library was used on the server to connect directly with the headset; it required a licence, which was obtained through the documentation on the website.
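A rough sketch of the laptop-side connection, assuming the Cortex service's JSON-RPC-over-WebSocket interface on its default local port; the method name and credential fields follow the public Cortex documentation, and the client id/secret placeholders are hypothetical:

```python
import json
import ssl
import websocket   # pip install websocket-client

# The Cortex service listens locally over a secure WebSocket.
ws = websocket.create_connection(
    "wss://localhost:6868", sslopt={"cert_reqs": ssl.CERT_NONE})

def call(method, params, req_id):
    """Send one JSON-RPC request to Cortex and return the parsed reply."""
    ws.send(json.dumps({"jsonrpc": "2.0", "id": req_id,
                        "method": method, "params": params}))
    return json.loads(ws.recv())

# Authorize with the licensed client credentials (placeholders).
auth = call("authorize", {"clientId": "CLIENT_ID",
                          "clientSecret": "CLIENT_SECRET"}, 1)
token = auth["result"]["cortexToken"]
# Subsequent calls would open a session, subscribe to the EEG stream,
# and relay classified commands to the Raspberry Pi over the local network.
```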
To demonstrate the classification commands in real time, we assembled a robot using the Raspberry Pi as the controller. We chose the Raspberry Pi 4 to meet these criteria: this low-cost, credit-card-sized computer has the processing power we need for the project, as well as 26 general-purpose input/output (GPIO) pins. These GPIO pins connect to the motor driver pins, allowing effective communication with the motors and integration with various analog or digital sensors.
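A minimal sketch of the Pi-side motor control using the RPi.GPIO library; the BCM pin numbers and the dual H-bridge pin pattern are assumptions, not the project's actual wiring:

```python
import RPi.GPIO as GPIO

# BCM pin numbers wired to the motor driver inputs (assumed layout).
LEFT_FWD, LEFT_BWD, RIGHT_FWD, RIGHT_BWD = 17, 18, 22, 23
PINS = [LEFT_FWD, LEFT_BWD, RIGHT_FWD, RIGHT_BWD]

GPIO.setmode(GPIO.BCM)
for pin in PINS:
    GPIO.setup(pin, GPIO.OUT, initial=GPIO.LOW)

# Each classified command drives a pin pattern: push=forward, pull=backward.
COMMANDS = {
    "push":  (1, 0, 1, 0),
    "pull":  (0, 1, 0, 1),
    "left":  (0, 0, 1, 0),   # right wheel only -> turn left
    "right": (1, 0, 0, 0),   # left wheel only -> turn right
}

def drive(command):
    """Set the motor driver pins for the given classified command."""
    for pin, level in zip(PINS, COMMANDS[command]):
        GPIO.output(pin, level)

drive("push")   # e.g., on receiving a "push" classification from the model
```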
The EEG data obtained from the headset was plagued with artifacts. To clean up the signal, filtering techniques were applied after selecting a reference, since EEG signals cover a wide range of frequencies. We utilized low-pass filters to reduce high-frequency noise from power lines and high-pass filters to remove low-frequency fluctuations, then isolated the precise frequency range of interest, between 8 Hz and 250 Hz, using bandpass filtering.
We then used a simple threshold-based method to remove artifacts. To normalise the signal, we first subtracted the mean of the data for each epoch/window, since the DC value is irrelevant and only distorts the Fourier transform at the lower frequencies; we then divided the output of the FFT (Fast Fourier Transform) by the window length and multiplied by two, because the Hanning window reduces the amplitude of the Fourier transform by a factor of two.
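The normalisation just described, as a NumPy sketch (the epoch here is a placeholder):

```python
import numpy as np

fs = 256
epoch = np.random.randn(fs * 8)      # one 8-second window of one channel

epoch = epoch - epoch.mean()         # remove the DC offset per window
window = np.hanning(len(epoch))      # Hanning taper

# FFT of the windowed epoch, divided by the window length, then doubled
# to compensate for the factor-of-two amplitude loss from the Hanning window.
spectrum = np.fft.rfft(epoch * window)
amplitude = 2.0 * np.abs(spectrum) / len(epoch)
```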
Before the raw EEG data was input into the machine learning model, the signals went through a transformative process using the Continuous Wavelet Transform (CWT), transitioning into a scalogram representation.
This approach offers a multi-resolution view of the EEG data, capturing both temporal and frequency-domain information. By employing CWT, subtle patterns and features inherent in the EEG signals are efficiently extracted, empowering the machine learning model to discern the intricate nuances and subtle variations crucial for accurate classification and analysis. Fig 4.9 shows a scalogram image converted using CWT.
This scalogram is a visual representation of the frequency distribution of the EEG signals over 10 seconds for each channel during execution of the "Push" command.
To pick the most suitable algorithm, we investigated a range of supervised learning techniques in our classification study: Decision Trees, Logistic Regression, Random Forests, XGBoost (eXtreme Gradient Boosting), Support Vector Machines (SVM), and K-Nearest Neighbors (KNN). To find the best model for our BCI application, we compared the performance of each algorithm using accuracy. Table 3 below compares all the models across the four commands, using five random subjects as the feature sample.
This evaluation revealed that XGBoost had the highest average classification accuracy on our dataset, about 98% on average across the four commands, the highest of all the models compared.
The XGBoost confusion matrix was largely accurate, although the model still had difficulty classifying the "Pull" command (label 4), whose accuracy underperforms the other true-positive labels. This discrepancy indicates a significant number of false positives and false negatives across multiple classes, a notable instance being 130 false-positive classifications for the "Push" command (label 1).
It is also important to remember that in the current study, each recording of a subject lasted only 10 minutes. It is possible that some pathological evidence exists in the remaining parts of the recordings.
Table 3: Classification accuracy per command, per model, for five randomly selected subjects.

| | Command | Decision Tree | Logistic Regression | Random Forests | SVM | XGBoost | KNN |
|---|---------|---------------|---------------------|----------------|-----|---------|-----|
| **Subject 1** | | | | | | | |
| 0 | Neutral | 1.0000 | 0.8999 | 0.9998 | 0.9026 | 0.9991 | 0.9236 |
| 1 | Push | 0.8687 | 0.0000 | 1.0000 | 0.2105 | 0.9500 | 0.5027 |
| 2 | Left | 0.8280 | 0.0000 | 1.0000 | 0.7333 | 0.9000 | 0.5000 |
| 3 | Right | 0.7895 | 0.1667 | 1.0000 | 0.5294 | 0.9636 | 0.5534 |
| 4 | Pull | 0.8300 | 0.0000 | 1.0000 | 0.6667 | 0.9143 | 0.5611 |
| **Subject 2** | | | | | | | |
| 0 | Neutral | 1.0000 | 0.9937 | 0.9937 | 0.9947 | 1.0000 | 0.9947 |
| 1 | Push | 0.3333 | 0.2454 | 1.0000 | 0.9947 | 0.0000 | 0.3529 |
| 2 | Left | 0.9286 | 0.0000 | 1.0000 | 0.3333 | 0.9767 | 0.3529 |
| 3 | Right | 0.9444 | 0.0000 | 1.0000 | 0.2353 | 0.9444 | 0.6667 |
| 4 | Pull | 0.8125 | 0.0000 | 1.0000 | 0.5000 | 0.9681 | 0.6667 |
| **Subject 3** | | | | | | | |
| 0 | Neutral | 1.0000 | 0.9222 | 0.9988 | 0.9350 | 0.9994 | 0.9614 |
| 1 | Push | 0.8936 | 0.0000 | 1.0000 | 0.6000 | 0.9524 | 0.7105 |
| 2 | Left | 0.7500 | 0.0000 | 1.0000 | 0.2233 | 1.0000 | 0.0000 |
| 3 | Right | 1.0000 | 0.0000 | 1.0000 | 1.0000 | 0.7778 | 0.3333 |
| 4 | Pull | 0.9667 | 0.3529 | 1.0000 | 0.7714 | 0.9681 | 0.7222 |
| **Subject 4** | | | | | | | |
| 0 | Neutral | 1.0000 | 0.9395 | 0.9994 | 0.9457 | 0.9996 | 0.9611 |
| 1 | Push | 0.9486 | 0.0000 | 0.9342 | 0.6471 | 0.9261 | 0.7097 |
| 2 | Left | 0.6429 | 0.0000 | 1.0000 | 0.0000 | 1.0000 | 0.4286 |
| 3 | Right | 0.8500 | 0.0000 | 1.0000 | 0.4545 | 1.0000 | 0.7500 |
| 4 | Pull | 0.8409 | 0.0000 | 0.9870 | 0.6111 | 0.9868 | 0.6275 |
| **Subject 5** | | | | | | | |
| 0 | Neutral | 1.0000 | 0.7343 | 0.9991 | 0.8395 | 0.9983 | 0.8324 |
| 1 | Push | 0.9749 | 0.0000 | 0.9930 | 0.9943 | 0.9825 | 0.6070 |
| 2 | Left | 0.7500 | 0.0000 | 1.0000 | 0.2308 | 0.9444 | 0.6667 |
| 3 | Right | 0.7500 | 0.0000 | 1.0000 | 0.6667 | 1.0000 | 1.0000 |
| 4 | Pull | 0.8417 | 0.5833 | 0.9333 | 0.7419 | 0.9160 | 0.7368 |
**Performance of the XGBoost Model**
A confusion matrix is a tabular representation that allows for a comprehensive evaluation of the performance of a machine learning model; the figure below shows a typical 2-by-2 confusion matrix. It presents a breakdown of the predictions made by the model against the actual ground truth across the different classes or categories: each row corresponds to an actual class and each column to a predicted class, so the diagonal cells count correct predictions and the off-diagonal cells count misclassifications.
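For reference, the matrix can be computed directly with scikit-learn; the label coding below mirrors the one used in this study (0 = neutral, 1 = push, 2 = left, 3 = right, 4 = pull), with toy predictions:

```python
from sklearn.metrics import confusion_matrix

# Ground-truth vs. predicted labels for a handful of illustrative trials.
y_true = [0, 1, 2, 3, 4, 1, 4]
y_pred = [0, 1, 2, 3, 1, 1, 4]
print(confusion_matrix(y_true, y_pred))
# Rows are actual classes, columns are predictions; the diagonal counts
# correct classifications, off-diagonal cells count the misclassifications.
```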
Our work compared several machine learning classifiers, including Decision Trees, Logistic Regression, XGBoost, Support Vector Machines (SVM), K-Nearest Neighbors (KNN), Random Forests, and Gradient Boosting Machines (GBM). These classifiers were evaluated for their ability to interpret EEG signals for tasks such as imagined speech, robotic control, and emotion recognition. The findings revealed differences in accuracy, computational efficiency, and applicability for various applications.
One of the most notable findings was the difference in computational requirements and processing speeds among the various machine learning algorithms. While some classifiers demonstrated high computational efficiency and real-time processing capabilities, others presented difficulties in terms of computational complexity and processing time. This aspect of computational power and speed was critical in determining whether the classifiers were practical for use in real-world BCI systems.
Another important aspect of the research was cost analysis, which took into account the implementation of machine learning algorithms in BCI systems. Software licensing fees, hardware requirements, and development costs were all considered to determine the overall cost-effectiveness of incorporating specific classifiers into BCI prototypes. Cost considerations influenced the decision to select classifiers for practical implementation.
Despite the promising performance of some classifiers, we identified several issues and limitations that must be addressed. These included concerns about classifier robustness, generalizability across diverse user populations, sensitivity to noise and artifacts in EEG signals, and interpretability of classifier outputs. Addressing these bottlenecks and limitations is critical to improving the reliability and usability of BCI systems in real-world situations.
Our findings have important implications for the development of brain-computer interface (BCI) technology. We evaluated various machine learning classifiers to gain insights into their performance, computational requirements, and suitability for real-time BCI applications. Understanding these factors is critical for improving the user experience and the usability of BCI systems. Furthermore, we shed light on the costs associated with incorporating specific classifiers into BCI systems, emphasizing the importance of identifying cost-effective solutions that strike a balance between performance and affordability.
Furthermore, we emphasize the importance of addressing limitations such as classifier robustness, generalizability, and noise susceptibility, which are critical for improving the reliability and usability of BCI technology in real-world settings. Moving forward, future research directions may include further optimization of machine learning algorithms, exploration of novel techniques to improve classifier performance, and integration of advanced signal processing methods.
Several key objectives were identified and pursued throughout this study in order to achieve the overall goal.
One limitation of our research is the number of commands and speech patterns examined. While we focused on commands like Push, Pull, Left, and Right, there are numerous other commands and speech patterns that could be integrated into the brain-computer interface (BCI) system. Understanding more complex speech patterns, such as "hello," "hi," or other sentences, could significantly improve the system's usability and flexibility in real-world scenarios. Expanding the repertoire of commands beyond those studied may increase the BCI system's versatility and applicability in a variety of contexts.
Increasing the detection capabilities for speech patterns is another area for improvement. By improving the algorithms and signal processing techniques used for speech-pattern detection, the system's ability to accurately interpret and respond to a broader range of imagined speech commands could be enhanced. These limitations highlight opportunities for future research to improve the functionality and effectiveness of BCI systems across a variety of applications.
We would like to first of all give thanks to the Almighty, to whom belongs the knowledge of all things, for guiding us throughout the period of this project. We would also like to acknowledge our parents' effort, love, and unconditional support throughout this arduous time; it would not have been possible without their support and continuous prayers.
We would also like to express our sincere gratitude to the Head of Department at Nile University of Nigeria, Abuja, Dr. Ali Nyangwarimam Obadiah, for his guidance and support. Special thanks to our supervisor, Engr. Akah Precious, for his guidance throughout. We sincerely extend our gratitude to Prof. Sadiq Thomas and the other lecturers at our university, as well as those from other institutions whom we reached out to, for their invaluable support and guidance.
Finally, we would like to appreciate our family, friends, coursemates, and well-wishers.