A product developer’s guide to machine learning (ML) regression model metrics
Nathaniel Tjandra
Growth
TLDR
The 2 commonly used metrics in regression are Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Mean Absolute Error is best for simple, orderly datasets. Root Mean Squared Error is best for complex, chaotic datasets.
Outline
Introduction
Regression defined
Bob’s Boba
Absolute measures
Mean Absolute Error (MAE)
Root Mean Squared Error (RMSE)
Understanding error values
Conclusion
Introduction
Most machine learning (ML) problems fit into 2 groups: classification and regression. The main metrics used to assess the performance of regression models are Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).
Regression defined
Regression is about finding a pattern in measurable values from the past to predict how those values will change in the future. To demonstrate each of these metrics, we’ll look at a scenario and create a model for it. Then we’ll apply regression evaluation metrics to compare the predictions against the actual values and estimate how much error there was.
In the last journey, we heard the tale of 4 heroes using accuracy and precision to slay monsters. Now, let’s take a look at the tale from the perspective of the boba shop in the village. The boba shop is run by a shopkeeper named Bob and has 2 loyal customers named Dan and Janet.
On the 2nd day, Bob thought he would see the heroes come back with Dan and Janet. Instead, Dan and Janet came back together and told the shopkeeper that the heroes were off to continue fighting and that they planned to leave to support them.
Metrics are used to tell how well a model fits the data. Regression metrics fall under 2 categories: relative measures and absolute measures. The most commonly used absolute measures are Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE).
Mean Absolute Error (MAE)
Mean Absolute Error measures, on average, how far each prediction is from the actual value. It’s calculated by summing the absolute difference between each prediction and its actual value, then dividing by the number of predictions.
MAE: 4
2 and 4 are incorrect by 2
6 and 2 are incorrect by 4
0 and 2 are incorrect by 2
0 and 8 are incorrect by 8
Mean Absolute Error = (2 + 4 + 2 + 8) / 4 = 16 / 4 = 4
from sklearn.metrics import mean_absolute_error

# Example Dataset
y_predicted = [2, 6, 0, 0]
y_actual = [4, 2, 2, 8]

# Calculate Mean Absolute Error
mae = mean_absolute_error(y_actual, y_predicted)
print("Mean Absolute Error:", mae)
The MAE is saying that the model is off by 4 visits per day on average. MAE is great because it’s easy to calculate and interpret, using only simple operations.
Root Mean Squared Error (RMSE)
Root Mean Squared Error is similar to MAE, except each difference between the predicted and actual value is squared (i.e. raised to the power of 2), the squared differences are averaged, and then the square root is taken. Squaring penalizes each error in proportion to its size: doubling an error quadruples its contribution. This makes RMSE useful for volatile datasets with many variables changing constantly during each trial. Be warned: by penalizing errors more strictly, the model may end up fitting one specific scenario rather than the general case, also known as overfitting.
RMSE: 4.69
2 and 4 are under by 2 = (-2)² = 4
6 and 2 are over by 4 = 4² = 16
0 and 2 are under by 2 = (-2)² = 4
0 and 8 are under by 8 = (-8)² = 64
Root Mean Squared Error = √((4 + 16 + 4 + 64) / 4) = √22 ≈ 4.69
from sklearn.metrics import mean_squared_error

# Example Dataset
y_predicted = [2, 6, 0, 0]
y_actual = [4, 2, 2, 8]

# Calculate Root Mean Squared Error by taking
# the square root of the Mean Squared Error
rmse = mean_squared_error(y_actual, y_predicted) ** 0.5
print("Root Mean Squared Error:", rmse)
The RMSE is saying that the overall model has an error of about 4.69 visits. By squaring each error before averaging, then taking the square root, RMSE reports error on the same scale as the original data while weighting larger errors more heavily.
Understanding error values
Reading error values is not as simple as saying low values are good and high values are bad. Similar to classification, choosing which evaluation metric to use depends on the data.
MAE is simpler to understand and shows the typical size of an error. RMSE suits complex datasets with multiple variables and penalizes larger errors more than MAE. Penalizing means the model holds itself more accountable for big misses, and the gap between RMSE and MAE shows how unevenly the errors are distributed, from small slips to critical failures.
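The penalizing effect can be seen by comparing two hypothetical prediction sets against the same actual values (the numbers below are illustrative, not from Bob’s shop): one where every prediction is off by 1, and one where a single prediction is off by 4. Both have the same MAE, but RMSE flags the second as worse.

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_actual = [4, 2, 2, 8]
steady = [3, 3, 1, 7]  # every prediction off by exactly 1
spiky = [4, 2, 2, 4]   # three perfect predictions, one off by 4

for name, y_pred in [("steady", steady), ("spiky", spiky)]:
    mae = mean_absolute_error(y_actual, y_pred)
    rmse = mean_squared_error(y_actual, y_pred) ** 0.5
    # Both sets have MAE = 1.0, but RMSE rises from 1.0 to 2.0
    # because the single large error gets squared
    print(name, "MAE:", mae, "RMSE:", rmse)
```

The total error is identical in both cases; only its distribution differs, which is exactly the difference RMSE is designed to surface.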
In Bob’s Boba story he was living an everyday life, filled with consistent events. Each day, only his regulars would consistently visit. This made the dataset simpler.
MAE calculation
(0 + 1 + 1 + 0) / 4 = 2 / 4 = 0.5
RMSE calculation
√((0² + 1² + 1² + 0²) / 4) = √0.5 ≈ 0.71
But as we know in Bob’s Boba story, a lot of unforeseen circumstances happened. His customers are taken captive and the presence of heroes drastically increases his store’s popularity, leading to more regular customers in the future. All this makes the dataset very chaotic, highly volatile, and extremely complex. Therefore, choosing RMSE is helpful to account for all the extra publicity and future growth.
Conclusion
RMSE and MAE are the go-to metrics for evaluating regression models, though other options exist, such as relative error metrics. Which metric to choose depends on the complexity of your data. Use MAE for a dataset that is small and stable with few changing variables. Use RMSE for datasets that are constantly changing and contain widely varying values.
Start building for free
No need for a credit card to get started.
Trying out Mage to build ranking models won’t cost a cent.