
Python Guide to Precision-Recall Tradeoff


Should we consider the accuracy score alone as a benchmark for a classification task? Many beginners in this field misunderstand this point: getting good accuracy from a classification model does not mean they have built a perfect model that classifies every instance correctly. In fact, accuracy on its own can be a deeply misleading benchmark for classification.

For a better understanding, let’s take a famous, general example that every data science enthusiast comes across: diabetes prediction. Here, both classes, whether a person has diabetes or not, matter, although their importance differs with the context. Say you have trained your model on 200K samples, with 180K samples in the negative class and 20K in the positive class, and you have achieved an accuracy greater than 95%. Sounds good? Hold on! While this solution has seemingly high accuracy, this is a problem for which accuracy is clearly not the proper metric, as the quick sketch below shows.
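To see why, consider what a model that always predicts the majority class would score on the split above (a standalone back-of-the-envelope sketch using the numbers assumed in this example):

 # A model that always answers "no diabetes" never detects a single patient,
 # yet its accuracy on the 180K-negative / 20K-positive split is already 90%.
 negatives, positives = 180_000, 20_000
 accuracy = negatives / (negatives + positives)  # every prediction is "negative"
 print(f"Accuracy of the always-negative model: {accuracy:.0%}")  # 90%

So a reported accuracy above 95% is only a few points better than a model that is useless for detecting diabetes.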

Diabetes detection and similar problems are mostly imbalanced classification tasks, where most data points belong to the negative class and the positive class is greatly outnumbered by the negative one. This is a fairly common situation in classification tasks. In such a case, accuracy alone is not a reliable measure of the model’s performance. This is where precision and recall come into the picture, giving us a clear insight into the performance on each class.

Let’s quickly jump into the code, where we can check these things in practice.

Code Implementation for Precision-Recall Tradeoff:

The popular Heart Disease dataset from the UCI repository is used to predict whether a patient is suffering from heart disease. You can download the dataset from here.

Importing all libraries:

 import pandas as pd
 import numpy as np
 import matplotlib.pyplot as plt
 import seaborn as sns
 from sklearn.model_selection import train_test_split
 from sklearn.linear_model import LogisticRegression
 from sklearn.metrics import confusion_matrix,classification_report
 from sklearn.metrics import precision_recall_curve 

Selecting input features and the target variable, and performing the train-test split:

 df = pd.read_csv('heart.csv')
 df.head() 
 x = df.drop(['target'],axis=1)
 y = df.target
 x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.33, random_state=1)  # random_state takes an integer seed, not True

Training the model and Classification Report:

Logistic Regression is used to fit the model and to get clear information about the precision and recall values. The solver is changed from the default ‘lbfgs’ to ‘newton-cg’ to avoid model convergence issues.

 model = LogisticRegression(solver='newton-cg').fit(x_train,y_train)
 y_predict = model.predict(x_test)
 print(classification_report(y_test,y_predict)) 

Deep Dive into the Precision-Recall Tradeoff:

To proceed further, one should know the confusion matrix: the 2×2 table of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) from which every metric below is built.

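We can print the confusion matrix for our fitted model directly, reusing the confusion_matrix import and the y_test/y_predict variables from the code above:

 # Rows are actual classes, columns are predicted classes.
 # scikit-learn's convention for binary problems: [[TN, FP], [FN, TP]].
 cm = confusion_matrix(y_test, y_predict)
 print(cm)

 tn, fp, fn, tp = cm.ravel()
 print(f"TP={tp}  FP={fp}  FN={fn}  TN={tn}")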

Precision:

Precision measures the exactness of the model. In our example: of all the patients the model predicted as suffering from heart disease, how many actually are? Formally, precision = TP / (TP + FP).

From the classification report, we can say this model’s positive predictions are correct 77% of the time against the actual ground truth.

Recall: 

Recall measures the completeness of the model; in other words, it measures how well the model identifies the actual positives. Formally, recall = TP / (TP + FN).

For our case, the recall for the positive class is 0.81: the model identifies 81% of the patients who actually have heart disease. Recall is also referred to as sensitivity or the true positive rate.
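As a sanity check, both numbers can be computed directly with scikit-learn’s precision_score and recall_score; a minimal sketch, reusing y_test and y_predict from above:

 from sklearn.metrics import precision_score, recall_score

 # Precision = TP / (TP + FP); Recall = TP / (TP + FN)
 print(f"Precision: {precision_score(y_test, y_predict):.2f}")  # ~0.77 per the report above
 print(f"Recall:    {recall_score(y_test, y_predict):.2f}")     # ~0.81 per the report above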

Precision-Recall Tradeoff:

For any problem, we mainly have to focus on one of the classes, or on both. In our example, the model should have high recall, which means a low number of false negatives: if the model says a person does not have heart disease, that person really should not have it.

If, instead, your main focus is that every person flagged as having heart disease really has it, your model should have high precision, which means you have to lower the number of false positives.

Unfortunately, you can’t have both precision and recall high at the same time: increasing precision tends to reduce recall, and vice versa. This is called the precision/recall tradeoff.

A classifier behaves differently at different threshold values, meaning a prediction can flip between positive and negative depending on where the threshold is set. Scikit-learn does not provide a direct facility to set this threshold, but it does give access to the decision scores it uses internally to make predictions. You can use these decision scores to change the precision and recall of your model and to find a tradeoff point, as the sketch below shows.
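A minimal sketch of manual thresholding, reusing the fitted model from above. For binary problems, LogisticRegression predicts the positive class when the decision score is greater than 0, so raising that cutoff trades recall for precision (the 1.0 cutoff here is an arbitrary value chosen purely for illustration):

 from sklearn.metrics import precision_score, recall_score

 scores = model.decision_function(x_test)  # signed distance from the decision boundary

 for cutoff in (0.0, 1.0):  # 0.0 reproduces model.predict(); 1.0 is a stricter cutoff
     y_pred_t = (scores > cutoff).astype(int)
     print(f"cutoff={cutoff}: "
           f"precision={precision_score(y_test, y_pred_t):.2f}, "
           f"recall={recall_score(y_test, y_pred_t):.2f}")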

Here we are using a graphical method to visualize the tradeoff between precision and recall.

 y_decision_function = model.decision_function(x_test)
 precision,recall,threshold = precision_recall_curve(y_test,y_decision_function)
 plt.plot(recall,precision)
 plt.xlabel('Recall')
 plt.ylabel('Precision')
 plt.title('Precision Recall Tradeoff')
 plt.show() 

From the above graph, observe the trend: for precision to reach 100%, recall drops to roughly 40%. From the graph, you might choose the tradeoff point where precision is nearly 87% and recall is around 70%. Again, it depends on your problem and on the priority that satisfies the needs of the actual application.
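If you have a precision target in mind, you can also read off the corresponding threshold programmatically from the arrays returned by precision_recall_curve. A sketch, assuming the roughly-87% precision target discussed above:

 target = 0.87  # assumed precision target, taken from the discussion above

 # precision and recall have one more entry than threshold, hence the guard below.
 idx = np.argmax(precision >= target)   # first index where the target is met
 idx = min(idx, len(threshold) - 1)
 print(f"threshold={threshold[idx]:.3f}, "
       f"precision={precision[idx]:.2f}, recall={recall[idx]:.2f}")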

So this is all about the precision-recall tradeoff, where we covered the terminology in detail with a practical example. Whenever you solve classification problems, imbalance in the outcome variable plays a very important role. In my previous article, where the outcome classes were nearly balanced, the model achieved high precision and recall overall. So pay attention to the balance between the classes; it helps you get good results overall.
