`DataScrub.py` processes EEG data collected using the Muse EEG headset, along with associated game behavior data. The script cleans and structures the data, making it suitable for further analysis using both Python and MATLAB. The output includes structured data in the form of dictionaries and data frames that can be saved and reused efficiently.
- Set Up Environment: Ensure you have the necessary Python libraries installed. Use the `requirements.txt` file for easy setup: `pip install -r requirements.txt`
- Run the Script: Execute the script in a Jupyter notebook or an IPython environment to interact with the input widget.
- Provide CSV File: Enter the file path to the Muse CSV file when prompted.
- View Processed Data: The script will process the data and output structured dictionaries and data frames. Plots will be generated if `plot_data_aspects` is set to `True`.
- Save Data: The processed data can be saved in both MATLAB and Python formats as specified in the script.
- The script expects an input CSV file from the Muse EEG headset. The file path is provided via a widget interface.
- The CSV file is modified so that it has the correct headers, and unnecessary spaces are removed to reduce file size.
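A minimal sketch of what this cleanup step could look like, assuming the unnecessary spaces are padding around comma-separated fields. The function name `clean_csv_text`, the header in `EXPECTED_HEADER` (apart from `Submod`), and the sample rows are hypothetical; the actual logic in `DataScrub.py` may differ.

```python
# Hypothetical header: only "Submod" is named in this README.
EXPECTED_HEADER = ["Timestamp", "Submod", "Value1", "Value2"]


def clean_csv_text(text, expected_header=EXPECTED_HEADER):
    """Strip padding around fields and prepend the header if it is missing."""
    cleaned = []
    for line in text.splitlines():
        if not line.strip():
            continue  # drop blank lines
        # Naive split: assumes no quoted fields containing commas.
        fields = [field.strip() for field in line.split(",")]
        cleaned.append(",".join(fields))
    header_line = ",".join(expected_header)
    if not cleaned or cleaned[0] != header_line:
        cleaned.insert(0, header_line)
    return "\n".join(cleaned) + "\n"


# Made-up rows with stray spaces and no header.
raw = "1.0,  /muse/eeg , 810.2, 795.4\n2.0, /muse/acc, 0.1 , 0.9\n"
cleaned_text = clean_csv_text(raw)
print(cleaned_text)
```

Running the function on its own output leaves the text unchanged, so the step is safe to repeat on a file that has already been cleaned.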
- Bring into `pandas.DataFrame`
- The modified CSV file is read into a pandas DataFrame for flexible data manipulation.
- Unique row components
- The unique components in the `Submod` column are identified and their counts are printed.
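As a rough illustration of reading the CSV and counting `Submod` entries, the sketch below uses an in-memory sample instead of a real Muse file. The other column names and the row values are made up, and the script itself may count entries differently.

```python
import io

import pandas as pd

# Made-up sample standing in for the modified Muse CSV.
sample = io.StringIO(
    "Timestamp,Submod,Value\n"
    "1.0,/muse/eeg,810.2\n"
    "1.1,/muse/eeg,811.0\n"
    "1.2,/muse/acc,0.1\n"
)
muse_df = pd.read_csv(sample)

# Count how many rows each unique Submod entry has.
submod_counts = muse_df["Submod"].value_counts()
print(submod_counts)
```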
- Helper Functions
- Helper functions facilitate nested assignments in dictionaries and extract specific data rows.
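The helpers might look roughly like the sketch below. The names `nested_set` and `rows_for`, the key path `["raw", "values"]`, and the sample rows are all hypothetical and chosen only for illustration; they are not taken from `DataScrub.py`.

```python
import pandas as pd


def nested_set(target, keys, value):
    """Assign value at target[k1][k2]..., creating intermediate dicts as needed."""
    node = target
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value


def rows_for(df, label, column="Submod"):
    """Return the rows whose `column` equals `label`, with a fresh index."""
    return df[df[column] == label].reset_index(drop=True)


# Made-up data standing in for the Muse DataFrame.
muse_df = pd.DataFrame({
    "Submod": ["/muse/eeg", "/muse/acc", "/muse/eeg"],
    "Value": [810.2, 0.1, 811.0],
})

eeg = {}
eeg_values = rows_for(muse_df, "/muse/eeg")["Value"].to_numpy()
nested_set(eeg, ["raw", "values"], eeg_values)
```

Using `setdefault` means intermediate levels are created on first use, so the dictionaries can be filled in any order.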
- Data Structure Creation
- Functional Data Structure
- Dict: `eeg`
- The `eeg` dictionary contains the raw EEG data, relative and absolute power bands, and FFT data. Reference and DRL signals are handled separately.
- Dict: `behavior`
- The `behavior` dictionary includes signals related to headband connectivity, muscle movements, and head acceleration.
- Description
- Descriptive fields are added to the `eeg` and `behavior` dictionaries for context.
- The game file, which tracks game behavior, is processed similarly. It is read into a pandas DataFrame and converted into a dictionary.
- Modify/Input Space-separated File
- The game file is checked for headers and modified if necessary.
- Bring into `pandas.DataFrame`
- The game file is read into a pandas DataFrame for verification.
- Dict: `game`
- The game data is converted into a dictionary with NumPy arrays for consistency and ease of saving.
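The game-file steps above can be sketched as follows. The column names in `GAME_COLUMNS`, the function name `load_game`, and the sample rows are hypothetical, since the real game-file layout is not listed here; the header check is one possible approach, not necessarily the script's.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical column names; the real game-file header is not given.
GAME_COLUMNS = ["time", "event", "score"]


def load_game(source, columns=GAME_COLUMNS):
    """Read a space-separated game file, supplying a header when it lacks one."""
    text = source.read()
    first_fields = text.splitlines()[0].split()
    has_header = first_fields == list(columns)
    return pd.read_csv(
        io.StringIO(text),
        sep=r"\s+",  # any run of whitespace separates fields
        header=0 if has_header else None,
        names=None if has_header else list(columns),
    )


# Made-up rows without a header line.
game_df = load_game(io.StringIO("0.5 1 10\n1.0 2 25\n"))

# One NumPy array per column, keyed by column name.
game = {name: game_df[name].to_numpy() for name in game_df.columns}
```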
- MATLAB Native
- The structured data is saved in MATLAB format using `scipy.io.savemat`.
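A small sketch of saving nested dictionaries with `scipy.io.savemat` and reading them back with `scipy.io.loadmat`. The dictionary contents and the file name `session.mat` are made up for illustration; the script's actual variable names and output path may differ.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Made-up stand-ins for the script's dictionaries.
eeg = {"raw": np.array([[810.2, 795.4], [811.0, 796.1]])}
game = {"score": np.array([10, 25])}

with tempfile.TemporaryDirectory() as tmp_dir:
    mat_path = os.path.join(tmp_dir, "session.mat")
    # Each top-level key becomes a MATLAB variable; nested dicts become structs.
    savemat(mat_path, {"eeg": eeg, "game": game})
    # Read the file back to confirm the round trip.
    restored = loadmat(mat_path, simplify_cells=True)
```

In MATLAB, `load('session.mat')` would then expose `eeg` and `game` as structs with one field per dictionary key.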
- Python Native
- Pickle Data Frames
- If specified, the pandas DataFrames for Muse and game data are saved using pickle.
- Pickle EEG / Behavior / Game Dictionaries
- The dictionaries for `eeg`, `behavior`, and `game` data are saved using pickle for efficient storage.
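For reference, a pickle round trip for the three dictionaries could look like the sketch below. The dictionary contents and the file name `session.pkl` are made up, and the script may write one file per dictionary instead of bundling them.

```python
import os
import pickle
import tempfile

# Made-up stand-ins for the script's dictionaries.
eeg = {"raw": [810.2, 811.0], "description": "raw EEG samples"}
behavior = {"blink": [0, 1, 0]}
game = {"score": [10, 25]}

with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = os.path.join(tmp_dir, "session.pkl")
    # Bundle the dictionaries and write them in binary mode.
    with open(pkl_path, "wb") as handle:
        pickle.dump({"eeg": eeg, "behavior": behavior, "game": game}, handle)
    # Load the bundle back to confirm nothing was lost.
    with open(pkl_path, "rb") as handle:
        restored = pickle.load(handle)
```

Pickle files can execute code when loaded, so only open files that come from a trusted source. For DataFrames, pandas also provides `DataFrame.to_pickle` and `pandas.read_pickle`.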