At the core of a machine learning system for diagnostics is an element known as the classifier. Classifiers are common to many application fields of Machine Learning and Computational Intelligence, and they apply equally when such methods are used to analyze EEG. Let us imagine a Parkinson’s prediction system [1] similar to the one described in my last post. The input of the system is the EEG data acquired from a subject who would like to get a prediction. The output is a decision on whether the subject is expected to develop Parkinson’s in the future. The same rationale applies to diagnostic systems, such as those used to determine whether a patient suffers from epilepsy, to name a standard application of EEG [2].
In the following paragraphs I will try to explain the different types of classifiers you can find in such systems. The taxonomy is based on the kind of output a classifier delivers. I have taken it from a book by Bezdek and colleagues [3] and tried to simplify it.
Crisp classifiers
When the classifier issues a decision directly, you are using a so-called crisp classifier. Its output is binary and takes the value 1 or 0. The system makes an absolute prediction or diagnosis: you have or do not have epilepsy, you will or will not develop Parkinson’s. It assigns the subject to one of two classes, e.g. epileptic patients or healthy subjects. For a diagnostic system this is probably what you need, but for prediction such an absolute output can be very risky.
Fuzzy or probabilistic classifiers
In the prediction case you would rather have a probability of the subject developing the disease. To count as a probabilistic measure, this type of output must fulfill several properties, as I mention in the next section. For the moment, however, the only property we are interested in is that the output is a real value between 0 and 1. This output is known as the classification score. We can think of the classification score as a probability, but we can also think of it as a degree of membership in a class, e.g. a value of 0.6 may indicate that your EEG shares some traits with 60% of epileptic patients, or that 60% of epileptic patients have an EEG similar to yours, or any other interpretation. Classifiers that issue such a real-valued classification score are called fuzzy or probabilistic classifiers.
If you want to make a decision based on this value, you need to transform it into a binary value. This binary value answers the question: Are you suffering from epilepsy? Can you expect to get Parkinson’s? So we are back in the crisp realm, where some humans feel more confident. The element of the system that transforms the classification score into a binary decision is the so-called decision threshold.
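As a minimal sketch of this step, the snippet below turns a classification score into a crisp decision. The function name `decide` and the default threshold of 0.5 are assumptions made for illustration; they do not come from any particular system.

```python
def decide(score: float, threshold: float = 0.5) -> int:
    """Turn a classification score in [0, 1] into a crisp 0/1 decision.

    Both the name and the default threshold are illustrative assumptions.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    # Scores at or above the threshold are assigned to the positive class.
    return 1 if score >= threshold else 0


default_decision = decide(0.6)        # compared against the default 0.5
strict_decision = decide(0.6, 0.8)    # same score, stricter threshold
print(default_decision, strict_decision)
```

The same score of 0.6 leads to a positive decision with the default threshold and to a negative one with the stricter threshold.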
The advantage of fuzzy or probabilistic classifiers over crisp ones is that you can tune the decision threshold to different applications. In a system for disease screening, you would like to detect all subjects presenting the disease, at the cost of also flagging several people who do not present it, the so-called false positives. In a definitive diagnostic system, on the other hand, you would like to have no false positives. This is the kind of tuning you can achieve by varying the decision threshold. This flexibility is an important feature in real-world systems.
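To make the trade-off concrete, here is a small sketch that counts false positives and missed cases at two thresholds. The scores, the labels, and the two threshold values are invented for illustration only; they are not real EEG results.

```python
# Hypothetical classification scores and true labels (1 = has the disease).
# These numbers are made up for illustration, not measured data.
scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 0]


def count_errors(scores, labels, threshold):
    """Return (false positives, false negatives) at a given decision threshold."""
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return false_pos, false_neg


# A low threshold suits screening: no case is missed, one healthy subject is flagged.
screening = count_errors(scores, labels, threshold=0.35)
# A high threshold suits confirmation: no false positives, but one case is missed.
confirmation = count_errors(scores, labels, threshold=0.70)
print(screening, confirmation)
```

With this toy data the low threshold yields one false positive and no missed case, while the high threshold yields no false positive and one missed case.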
Possibilistic classifiers
There is one feature of probabilistic classifiers we have not commented on so far: all output scores have to sum to 1. This means that in a two-class problem, if one class receives a membership degree of 0.7, the score of the other class needs to be 0.3 for the classifier to be considered a fuzzy or probabilistic one. If this condition is not fulfilled, you are working with a so-called possibilistic classifier. In this case the output of the classifier can be a 0.7 membership degree in the epileptic class together with a 0.6 membership degree in the non-epileptic class. As you can see, possibilistic classifiers are more flexible than probabilistic ones. This has some advantages, but I will talk about them in a future post.
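The sum-to-1 condition can be checked in a few lines. This is only a sketch following the simplified distinction used in this post; the function name `output_kind` and the tolerance value are assumptions.

```python
import math


def output_kind(memberships, tol=1e-9):
    """Label a vector of class memberships according to the simplified rule above.

    Hypothetical helper: 'fuzzy/probabilistic' if the scores sum to 1,
    'possibilistic' otherwise.
    """
    if any(m < 0.0 or m > 1.0 for m in memberships):
        raise ValueError("each membership must lie between 0 and 1")
    # A tolerance is used because floating-point sums are rarely exact.
    if math.isclose(sum(memberships), 1.0, abs_tol=tol):
        return "fuzzy/probabilistic"
    return "possibilistic"


constrained = output_kind([0.7, 0.3])    # epileptic 0.7, non-epileptic 0.3
unconstrained = output_kind([0.7, 0.6])  # epileptic 0.7, non-epileptic 0.6
print(constrained, unconstrained)
```

The first output satisfies the constraint; the second is the example from the paragraph above, whose memberships sum to 1.3.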