Predictive Performance Models Evaluation Metrics

Predictive performance modeling has been on the front line of the fight against COVID-19. It has helped predict the prevalence of the virus and decide which measures would respond to it effectively. Predictive analytics has also become a trusted advisor to many businesses, and for good reason. These models can “predict the future”, and because many different techniques are available, any industry can find one that fits its particular challenges.

However, the abundance of predictive modeling techniques and predictive analytics software means that several models can give good predictions for even a simple problem, and choosing the right one can be challenging.

To choose well, you need to evaluate model performance with metrics that truly measure how well each model achieves the company's overall business goals.

Predictive Models Performance Evaluation is Important

The choice of metrics influences how the performance of a predictive model is measured and compared. But metrics can also be deceiving. If we do not use metrics that correctly measure how accurately the model predicts our problem, we might be fooled into thinking that we have built a robust model. Let’s look at an example to understand why that can be a problem and how to deal with it.

Take, for example, the prediction of a rare disease that occurs in 1% of the population. If we use a metric that only tells us how often the model makes the correct prediction, we might end up with around 99% accuracy, because the model will be right 99% of the time simply by predicting that nobody has the disease. That is, however, not the point of the model.

Instead, we might want to use a metric that evaluates only the true positives and the false negatives, such as recall (also called sensitivity), which shows how good the model is at detecting actual cases of the disease.
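
The rare disease example can be worked through in a few lines of Python. This is only a sketch: the population of 1,000 people and the "always predict healthy" model are made-up assumptions, and `accuracy` and `recall` are helper names introduced here for illustration.

```python
# Hypothetical population: 1,000 people, 1% of whom have the disease (label 1).
y_true = [1] * 10 + [0] * 990

# A naive "model" that always predicts "no disease" (label 0).
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Share of all predictions that are correct."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Share of actual positives that were found: TP / (TP + FN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


print(accuracy(y_true, y_pred))  # 0.99
print(recall(y_true, y_pred))    # 0.0
```

The accuracy looks excellent, yet the recall shows that the model did not find a single sick patient.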

Proper evaluation of predictive models is also important because we want our model to perform consistently across many different data sets. In other words, the results need to be comparable, measurable and reproducible, which matters in heavily regulated industries such as insurance and healthcare.

Let’s now dive into prediction performance, the most commonly used metrics, their use cases, and their limitations.

How to Evaluate Model Performance and What Metrics to Choose

All problems a predictive model can solve fall into one of two categories: a classification problem or a regression problem. Depending on which category your business challenge falls into, you will need to use different metrics to evaluate your model.

That is why it is important to first determine what overall business goal or business problem needs to be solved. That will be the starting point for your data science team to choose the metrics, and ultimately determine what a good model is.

Classification Problems

A classification problem is about predicting what category something falls into. An example of a classification problem is analyzing medical data to determine whether or not a patient is in a high-risk group for a certain disease.

Metrics that can be used to evaluate a classification model:

Percent correct classification (PCC): measures overall accuracy. Every error has the same weight.
Confusion matrix: also measures accuracy but distinguishes between the kinds of outcome, i.e. true positives, true negatives, false positives and false negatives.

Both of these metrics are good to use when every data entry needs to be scored. For example, if every customer who visits a website needs to be shown customized content based on their browsing behavior, every visitor will need to be categorized.
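
As a rough illustration of these two metrics, the sketch below counts the four cells of a confusion matrix for a binary classifier and derives the PCC from them. The labels are invented for the example, and `confusion_matrix` and `pcc` are hypothetical helper names, not functions from a particular library.

```python
def confusion_matrix(y_true, y_pred):
    """Count the four possible outcomes of a binary classifier (1 = positive)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            counts["tp"] += 1  # correctly flagged
        elif t == 0 and p == 1:
            counts["fp"] += 1  # false alarm
        elif t == 1 and p == 0:
            counts["fn"] += 1  # missed case
        else:
            counts["tn"] += 1  # correctly ignored
    return counts


def pcc(counts):
    """Percent correct classification: correct predictions over all predictions."""
    return 100.0 * (counts["tp"] + counts["tn"]) / sum(counts.values())


# Made-up labels for eight website visitors.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)       # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
print(pcc(cm))  # 75.0
```

The PCC collapses everything into one number, while the confusion matrix keeps the false positives and false negatives apart so they can be weighed differently.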

If, however, you only need to act upon results connected to a subset of your data – for example, if you aim to identify high churn clients to interact with, or, as in the earlier example, predict a rare disease – you might want to use the following metrics:

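Area Under the ROC Curve (AUC-ROC): one of the most widely used evaluation metrics. It measures how well the model ranks positive cases above negative ones: an AUC of 1.0 means every positive is scored higher than every negative, while 0.5 is no better than random ordering. The ROC curve is also independent of changes in the proportion of responders.
Lift and Gain charts: both charts measure the effectiveness of a model by comparing the results obtained with and without the predictive model. A gain chart shows the share of all positive cases captured within the top-scored portion of the data, and a lift chart shows the ratio of that result to what random selection would give. In other words, these metrics examine whether using a predictive model has any positive effect.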
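
Both ideas can be sketched in plain Python. The version below is a minimal illustration under stated assumptions: AUC is computed by brute force as the share of positive/negative pairs that the model orders correctly (ties count as half), and gain and lift are computed for a single cut-off rather than for a full chart. The scores, labels and function names are all invented for the example.

```python
def roc_auc(y_true, scores):
    """Share of (positive, negative) pairs where the positive has the higher score."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(pos) * len(neg))


def _top_labels(y_true, scores, fraction):
    """Labels of the highest-scored `fraction` of cases (at least one case)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return [label for _, label in ranked[:k]]


def gain_at(y_true, scores, fraction):
    """Share of all positives captured in the top-scored fraction."""
    return sum(_top_labels(y_true, scores, fraction)) / sum(y_true)


def lift_at(y_true, scores, fraction):
    """Positive rate in the top-scored fraction divided by the overall rate."""
    top = _top_labels(y_true, scores, fraction)
    return (sum(top) / len(top)) / (sum(y_true) / len(y_true))


# Ten hypothetical clients: 1 = churned, scores = predicted churn risk.
y_true = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

print(roc_auc(y_true, scores))       # 17 of 21 pairs ordered correctly, about 0.81
print(gain_at(y_true, scores, 0.2))  # top 20% captures 1 of 3 churners, about 0.33
print(lift_at(y_true, scores, 0.2))  # about 1.67 times better than random selection
```

The pairwise loop is quadratic in the number of cases, so it is only meant to make the definition concrete; for large data sets a rank-based implementation from an established library is the more practical choice.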

Regression Problems

A regression problem is about predicting a quantity. A simple example of a regression problem is prediction of the selling price of a real estate property based on its attributes (location, square meters available, condition, etc.).

To evaluate how good your regression model is, you can use the following metrics:

R-squared: indicates the proportion of the variance in the actual values that the model explains. R-squared does not take into consideration any biases that might be present in the data. Therefore, a good model might have a low R-squared value, or a model that does not fit the data might have a high R-squared value.
Average error: the average of the numerical differences between the predicted values and the actual values. Positive and negative errors can cancel each other out.
Mean Square Error (MSE): the average of the squared differences between the predicted and the actual values. Squaring penalizes large errors heavily, so MSE is sensitive to outliers; it is good to use when large errors are especially costly.
Median error: the median of all differences between the predicted and the actual values.
Average absolute error: similar to the average error, only you use the absolute value of each difference so that positive and negative errors do not cancel out. All individual differences have equal weight, and big outliers can therefore affect the final evaluation of the model.
Median absolute error: the median of the absolute differences between prediction and actual observation. Because it uses the median, it is robust to outliers.
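
To make the differences between these metrics concrete, here is a small sketch that computes all six with the standard textbook definitions (R-squared as 1 minus the residual sum of squares over the total sum of squares). The prices are invented, in thousands, and the last prediction is deliberately far off to act as an outlier; `regression_metrics` is a hypothetical helper name.

```python
from statistics import mean, median


def regression_metrics(actual, predicted):
    """Compute common regression metrics from paired actual/predicted values."""
    errors = [p - a for a, p in zip(actual, predicted)]  # signed differences
    abs_errors = [abs(e) for e in errors]
    avg_actual = mean(actual)
    ss_res = sum(e ** 2 for e in errors)                  # residual sum of squares
    ss_tot = sum((a - avg_actual) ** 2 for a in actual)   # total sum of squares
    return {
        "r_squared": 1 - ss_res / ss_tot,  # undefined if all actual values are equal
        "average_error": mean(errors),
        "mse": ss_res / len(errors),
        "median_error": median(errors),
        "average_absolute_error": mean(abs_errors),
        "median_absolute_error": median(abs_errors),
    }


# Hypothetical selling prices (in thousands); the last prediction is an outlier.
actual = [200, 250, 300, 350, 400]
predicted = [210, 240, 310, 340, 500]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(name, value)
```

With this data the single outlier pushes the MSE to 2080 and the average absolute error to 28, while the median absolute error stays at 10, which is the size of the other four errors.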

Conclusion

To get the true value of a predictive model, you have to know how well your model fits the data. Your model should also hold up when the data changes or when it is applied to a completely new data set.

To start, you need to be clear about what business challenge the model is helping to solve. This will determine whether you are working with a classification or a regression problem, and make it easier to choose the right metrics.

As we mentioned in the beginning, there are multiple models that can be a good fit for your particular business problem. That is why model evaluation is a process in which you benchmark models against each other to find the best fit.
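
A benchmark of this kind can be as simple as scoring every candidate on the same held-out data with the metric you chose. The sketch below assumes a regression problem evaluated by average absolute error; the model names, predictions and the `benchmark` helper are all hypothetical.

```python
def average_absolute_error(actual, predicted):
    """Mean of the absolute differences between predicted and actual values."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def benchmark(actual, candidates, metric):
    """Score every candidate on the same data; the lowest score wins."""
    scores = {name: metric(actual, preds) for name, preds in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Held-out actual values and the predictions of two made-up models.
actual = [200, 250, 300, 350]
candidates = {
    "model_a": [210, 240, 310, 340],
    "model_b": [205, 255, 295, 380],
}

best, scores = benchmark(actual, candidates, average_absolute_error)
print(best)    # model_a
print(scores)  # {'model_a': 10.0, 'model_b': 11.25}
```

For a metric where higher is better, such as AUC, the selection would use `max` instead of `min`.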

Work with InData Labs on Your Predictive Analytics and Machine Learning Project

Have a project in mind but need some help implementing it? Drop us a line at info@indatalabs.com. We’d love to discuss how we can work with you.
