A guide to the analytics figures

MariusKlug edited this page May 18, 2022 · 7 revisions

Throughout the pipeline, figures are created to allow a quick check of whether the processing worked as expected. They are located in the data folder of each participant and processing level. Their file names begin with the participant prefix and number (e.g. 'sub-' and 66, respectively), and every plot is saved both as a .png and as a .fig file (the latter to be opened in MATLAB).

EEG Preprocessing

All plots to check the EEG preprocessing are stored in the respective folder (3_EEG_preprocessing by default).

Raw data

The first thing to check is the raw data. The sub-66_raw.png figure plots six 10-second sections of the data:

plot of raw data

The same six sections are plotted again after each cleaning step to enable direct comparison.

Zapline-plus

Zapline-plus automatically detects and removes spectral noise artifacts. This includes not only 50 Hz line noise but also other noise that can arise from devices in the lab. An analytics plot is saved for every detected noise frequency, and the figure name contains that frequency: for example, sub-66_basic_prepared.set_zapline_28_2421.png is the plot for the detected noise frequency of 28.2421 Hz, and sub-66_basic_prepared.set_zapline_49_9859.png is the one for 49.9859 Hz.
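When many participants are processed, it can help to list the detected noise frequencies directly from the figure names. The following is a minimal Python sketch of that idea, not part of the pipeline: the function name `noise_freq_from_name` is hypothetical, and the pattern is only inferred from the two example file names above (whole and fractional part separated by an underscore).

```python
import re
from typing import Optional

# Pattern inferred from the example names, e.g. "..._zapline_28_2421.png".
# The fractional part is treated as optional; that is an assumption.
_ZAPLINE_NAME = re.compile(r"_zapline_(\d+)(?:_(\d+))?\.(?:png|fig)$")


def noise_freq_from_name(filename: str) -> Optional[float]:
    """Return the noise frequency in Hz encoded in a Zapline-plus figure name.

    Returns None if the name does not look like a Zapline-plus figure.
    """
    match = _ZAPLINE_NAME.search(filename)
    if match is None:
        return None
    whole, frac = match.group(1), match.group(2)
    if frac is None:
        return float(whole)
    # Rebuild the decimal number: "28" and "2421" become 28.2421.
    return float(f"{whole}.{frac}")


if __name__ == "__main__":
    for name in (
        "sub-66_basic_prepared.set_zapline_28_2421.png",
        "sub-66_basic_prepared.set_zapline_49_9859.png",
        "sub-66_raw.png",
    ):
        print(name, "->", noise_freq_from_name(name))
```

Applied to a directory listing, this gives a quick overview of which frequencies were cleaned for each participant before the individual plots are opened.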

Ideally, the ratio of noise to surroundings after cleaning should be close to 1: not much below 0.95 and not much above 1.2.
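If the ratios are noted down while inspecting the plots, the rule of thumb above can be applied as a simple screening step. This is a sketch under assumptions: the function name `screen_cleaned_ratio` is hypothetical, the ratio is assumed to have been read off the plot by hand, and the bounds are soft guidelines rather than hard limits.

```python
def screen_cleaned_ratio(ratio: float, lower: float = 0.95, upper: float = 1.2) -> str:
    """Classify a cleaned noise-to-surroundings ratio against the rule of thumb.

    Returns "below", "within", or "above". The bounds default to the values
    named in the text and are meant as guidance, not as strict thresholds.
    """
    if ratio < lower:
        return "below"
    if ratio > upper:
        return "above"
    return "within"


if __name__ == "__main__":
    # Hypothetical ratios noted down from three plots.
    for value in (0.90, 1.02, 1.35):
        print(value, screen_cleaned_ratio(value))
```

A result of "below" or "above" is only a prompt to look at the corresponding plot more closely.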

If the noisefreqs parameter is set to the string 'line', only line noise at either 50 or 60 Hz (selected automatically) is detected and removed. If it is left empty (default), spectral artifacts are searched for and removed in the 17-99 Hz range.

For an in-depth understanding of how to use Zapline-plus and how to interpret the plots in detail, see the Zapline-plus wiki.

zapline 28hz zapline 50hz

The following plot shows a frequency artifact that Zapline-plus finds difficult to remove, although it is not particularly problematic in the first place. In such cases, it can be useful to restrict the maximum cleaning frequency (maxfreq parameter) to 52 Hz, so that the line frequency is still included but this artifact is not. Alternatively, one can increase the coarseFreqDetectPowerDiff parameter so that only stronger artifacts are detected. However, Zapline-plus adapts its cleaning to keep the negative impact minimal, so leaving this cleaning step in place is not too problematic either.

zapline 56hz

Bad channel detection and interpolation

We use the clean_rawdata EEGLAB plugin to detect bad channels. However, because this plugin uses a random initialization, it does not always find the same channels. The results only differ after a restart of MATLAB or when the cache is cleared (which the BeMoBIL pipeline does). The detection is therefore run for several iterations, and only channels that are flagged as bad in more than a given proportion of iterations (adjustable in the config file) are finally taken as bad. To make this process transparent, it is plotted in sub-66_interpolated_channels.png. The first plot shows the bad channels of each iteration, followed by the ratio of iterations in which each channel was flagged and the final detected bad channels. The order of the channels is the same as in the figures of the data itself. If a reference channel was entered in the config (as in this example), it is visible as an added channel at the bottom, but this added channel is discarded and only recreated after the interpolation.

bad channel detection
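The voting across iterations described above can be sketched as follows. This is an illustration in Python, not the pipeline's MATLAB code: the function name `final_bad_channels` and its arguments are hypothetical, and the strict "more than" comparison follows the wording of the text.

```python
from typing import List, Sequence, Set


def final_bad_channels(
    flagged_per_iteration: Sequence[Set[int]],
    n_channels: int,
    min_fraction: float,
) -> List[int]:
    """Return channels flagged as bad in more than `min_fraction` of iterations.

    `flagged_per_iteration` holds one set of zero-based channel indices per
    detection run; `min_fraction` plays the role of the configurable proportion.
    """
    if not flagged_per_iteration:
        raise ValueError("at least one iteration is required")

    # Count how often each channel was flagged across all iterations.
    counts = [0] * n_channels
    for flagged in flagged_per_iteration:
        for channel in flagged:
            counts[channel] += 1

    n_iterations = len(flagged_per_iteration)
    # Keep only channels whose flagged ratio strictly exceeds the threshold.
    return [
        channel
        for channel, count in enumerate(counts)
        if count / n_iterations > min_fraction
    ]


if __name__ == "__main__":
    runs = [{1, 3}, {1}, {1, 2}, {3}]
    # Channel 1 is flagged in 3 of 4 runs, channel 3 in 2 of 4, channel 2 in 1 of 4.
    print(final_bad_channels(runs, n_channels=5, min_fraction=0.5))
```

In this toy example only channel 1 exceeds a proportion of 0.5; channel 3 is flagged in exactly half of the runs and is therefore not taken as bad.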

Subsequently, the final bad channels are visualized in sub-66_bad_channels.png, which contains the same data sections as the raw data plot. As before, if a reference channel was entered in the config (as in this example), it is visible as an added channel at the bottom, but it is discarded and only recreated after the interpolation.

bad channels

Lastly, these channels are removed and interpolated, and the data are re-referenced to the average and visualized in sub-66_interpolated_channels.png. Again, if a reference channel was entered in the config (as in this example), it is now visible at the bottom. This is the channel that will be available for analysis. Apart from ICA, these data are fully preprocessed.

interpolated channels

AMICA

All plots to check the adaptive mixture independent component analysis (AMICA) processing are stored in the respective folder (4_spatial-filters\4-1_AMICA by default).

To allow a direct check of the ICA topographies, all ICs are plotted in sub-66_all_ICs.png.

all ICs

Additionally, we plot the samples that were automatically rejected by AMICA in sub-66_AMICA_autoreject.png. We do not apply any other time-domain artifact cleaning because, in our experience, AMICA knows best what fits and what does not. The drawback is that small numbers of consecutive samples are removed, which makes the plot look more red than it should (the red removed samples are plotted last and are always at least one pixel wide). Be sure to check the percentage at the top, which should be in the range of 2-10%, depending on the signal quality of your data. To check samples in detail, you can also open the .fig file in MATLAB. Only the first tenth of the channels is plotted, since the figure otherwise becomes too large. Note that in this example the first part of the data is noisier than the second part, because the data contain merged datasets from mobile and stationary conditions.

amica autoreject
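Because the plot can look more red than the actual rejection warrants, the percentage is the more reliable number. The sketch below shows how such a percentage follows from a per-sample rejection mask and how it compares with the 2-10% guideline. It is an illustration only: the function names and the boolean mask are hypothetical and do not reflect how the pipeline stores the rejected samples.

```python
from typing import Sequence


def rejected_percentage(rejected: Sequence[bool]) -> float:
    """Return the percentage of samples marked as rejected in a boolean mask."""
    if len(rejected) == 0:
        raise ValueError("mask must contain at least one sample")
    n_rejected = sum(1 for is_rejected in rejected if is_rejected)
    return 100.0 * n_rejected / len(rejected)


def within_typical_range(percent: float, low: float = 2.0, high: float = 10.0) -> bool:
    """Check a rejection percentage against the 2-10% guideline from the text."""
    return low <= percent <= high


if __name__ == "__main__":
    # Toy mask: 5 rejected samples out of 100.
    mask = [True] * 5 + [False] * 95
    percent = rejected_percentage(mask)
    print(percent, within_typical_range(percent))
```

A value outside the range is not an error in itself, since the guideline depends on the signal quality of the data.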

Final single-subject data

The final single-subject folder (5_single-subject-EEG-analysis by default) contains plots of the final cleaned file. All of them concern the automatic ICA cleaning using ICLabel. The final data without ICLabel cleaning are stored in sub-66_preprocessed_and_ICA.set and are identical to the data shown in the interpolated channels plot of the EEG preprocessing: the dataset only contains the additional ICA information and is otherwise unchanged.

The sub-66_cleaned_with_ICA.set data, however, have all artifact ICs removed, as classified by ICLabel with the parameters of the config (by default, only brain ICs are kept). To allow a check of this process, the remaining brain IC topographies are plotted in sub-66_cleaned_with_ICA_ICs_kept.png and their equivalent dipoles are plotted in sub-66_cleaned_with_ICA_brain_dipoles.png.

Note that this is a very important part to check because ICLabel is the weakest link in the chain of automatic MoBI data processing. It is the best tool we have, but the results are not always perfect, as can be seen in IC 49 of this participant (clearly a bad channel, misclassified as brain).

ICs kept brain dipoles

Since the dipoles are plotted in 3D, it is recommended to also open the .fig file in MATLAB and rotate the view.
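The selection of brain ICs described above can be illustrated with a small sketch. It assumes a simple winner-takes-all rule (an IC is kept if "Brain" is its most probable ICLabel class); the rule the pipeline actually applies depends on the config parameters and may differ. The function name `brain_ics` and the probability values are hypothetical; the class order is the one ICLabel uses.

```python
from typing import List, Sequence

# The seven ICLabel classes in the order of ICLabel's probability columns.
ICLABEL_CLASSES = (
    "Brain", "Muscle", "Eye", "Heart", "Line Noise", "Channel Noise", "Other",
)


def brain_ics(class_probabilities: Sequence[Sequence[float]]) -> List[int]:
    """Return zero-based indices of ICs whose most probable class is "Brain".

    Assumption: winner-takes-all classification, one row of seven
    probabilities per IC.
    """
    kept = []
    for ic_index, probs in enumerate(class_probabilities):
        if len(probs) != len(ICLABEL_CLASSES):
            raise ValueError("each IC needs one probability per ICLabel class")
        # Index of the class with the highest probability for this IC.
        best = max(range(len(probs)), key=lambda i: probs[i])
        if ICLABEL_CLASSES[best] == "Brain":
            kept.append(ic_index)
    return kept


if __name__ == "__main__":
    # Made-up probabilities for three ICs.
    example = [
        [0.90, 0.02, 0.02, 0.01, 0.01, 0.02, 0.02],  # clearly brain
        [0.10, 0.70, 0.05, 0.05, 0.02, 0.03, 0.05],  # muscle
        [0.40, 0.10, 0.10, 0.05, 0.05, 0.25, 0.05],  # brain wins narrowly
    ]
    print(brain_ics(example))
```

The third IC in the toy example is kept although its channel noise probability is substantial, which is the kind of borderline case that makes a visual check of the kept ICs worthwhile.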

As the last plot, sub-66_cleaned.png visualizes the final cleaned data in the same six data sections to allow a full comparison with the raw data.

final cleaned data

Taken together, these plots allow the full examination of all relevant intermediate steps as well as the final result of the EEG processing pipeline.