Repeatedly calling a function with the same fixed arguments makes code redundant and harder to read. In this article, we will explore how to remove that repetition using functools.partial.
The Problem
Let’s consider an example where we have a DataFrame with salary, bonus, and revenue columns, and we want to perform quartile binning on each column.
import pandas as pd
df = pd.DataFrame({
'salary': [45000, 75000, 125000, 85000],
'bonus': [5000, 15000, 25000, 10000],
'revenue': [150000, 280000, 420000, 310000]
})
processed_df = df.copy()
# Repetitive binning operations
processed_df['salary_level'] = pd.qcut(processed_df['salary'], q=4, labels=['Q1', 'Q2', 'Q3', 'Q4'])
processed_df['bonus_level'] = pd.qcut(processed_df['bonus'], q=4, labels=['Q1', 'Q2', 'Q3', 'Q4'])
processed_df['revenue_level'] = pd.qcut(processed_df['revenue'], q=4, labels=['Q1', 'Q2', 'Q3', 'Q4'])
processed_df
index | salary | bonus | revenue | salary_level | bonus_level | revenue_level |
---|---|---|---|---|---|---|
0 | 45000 | 5000 | 150000 | Q1 | Q1 | Q1 |
1 | 75000 | 15000 | 280000 | Q2 | Q3 | Q2 |
2 | 125000 | 25000 | 420000 | Q4 | Q4 | Q4 |
3 | 85000 | 10000 | 310000 | Q3 | Q2 | Q3 |
This code is repetitive and hard to maintain. If we want to change the binning strategy, we have to modify it in multiple places.
The Solution
functools.partial is a higher-order function that allows us to create new function variations with pre-set arguments. We can use it to simplify our code and make it more maintainable.
from functools import partial
processed_df = df.copy()
# Create a standardized quartile binning function
quartile_bin = partial(pd.qcut, q=4, labels=["Q1", "Q2", "Q3", "Q4"])
# Apply the binning function consistently
processed_df["salary_level"] = quartile_bin(processed_df["salary"])
processed_df["bonus_level"] = quartile_bin(processed_df["bonus"])
processed_df["revenue_level"] = quartile_bin(processed_df["revenue"])
processed_df
index | salary | bonus | revenue | salary_level | bonus_level | revenue_level |
---|---|---|---|---|---|---|
0 | 45000 | 5000 | 150000 | Q1 | Q1 | Q1 |
1 | 75000 | 15000 | 280000 | Q2 | Q3 | Q2 |
2 | 125000 | 25000 | 420000 | Q4 | Q4 | Q4 |
3 | 85000 | 10000 | 310000 | Q3 | Q2 | Q3 |
In this example, partial creates a standardized binning function with pre-set parameters for the number of quantiles and their labels. This ensures consistent binning across different columns.
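To see what partial actually stores, it can help to look at a smaller case that needs nothing beyond the standard library. The sketch below is only an illustration: label_value and three_tier are hypothetical names, not part of pandas or of the example above. It shows that the pre-set arguments are kept on the partial object and that a keyword can still be overridden for a single call.

```python
from functools import partial


def label_value(value, thresholds, labels):
    """Return the label of the first threshold that value does not exceed."""
    for threshold, label in zip(thresholds, labels):
        if value <= threshold:
            return label
    # Values above every threshold fall into the last label
    return labels[-1]


# Pre-set the keyword arguments once (hypothetical thresholds and labels)
three_tier = partial(label_value, thresholds=[10, 20], labels=["low", "mid", "high"])

print(three_tier(5))   # low
print(three_tier(15))  # mid
print(three_tier(25))  # high

# The partial object remembers the wrapped function and its pre-set arguments
print(three_tier.func is label_value)  # True
print(three_tier.keywords)

# A pre-set keyword can still be overridden for one call
print(three_tier(15, labels=["a", "b", "c"]))  # b
```

The quartile_bin function works the same way: q and labels are held in its keywords attribute and passed to pd.qcut on every call.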
Changing the Binning Strategy
If we need to change the binning strategy, we only need to define it in one place. Here we switch from quartiles to quintiles.
processed_df = df.copy()
# Easy to create different binning strategies
quintile_bin = partial(pd.qcut, q=5, labels=["Bottom", "Low", "Mid", "High", "Top"])
processed_df["salary_level"] = quintile_bin(processed_df["salary"])
processed_df["bonus_level"] = quintile_bin(processed_df["bonus"])
processed_df["revenue_level"] = quintile_bin(processed_df["revenue"])
processed_df
index | salary | bonus | revenue | salary_level | bonus_level | revenue_level |
---|---|---|---|---|---|---|
0 | 45000 | 5000 | 150000 | Bottom | Bottom | Bottom |
1 | 75000 | 15000 | 280000 | Low | High | Low |
2 | 125000 | 25000 | 420000 | Top | Top | Top |
3 | 85000 | 10000 | 310000 | High | Low | High |
By using functools.partial, we have simplified our code and made it more maintainable. We can easily create different binning strategies and apply them consistently across our DataFrame.
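Because a partial object is an ordinary callable, it can also be passed to other functions, which removes the remaining per-column repetition. The following is a minimal sketch under that assumption: add_levels is a hypothetical helper introduced here for illustration, not a pandas function, and it reuses the sample data from the article.

```python
from functools import partial

import pandas as pd

df = pd.DataFrame({
    "salary": [45000, 75000, 125000, 85000],
    "bonus": [5000, 15000, 25000, 10000],
    "revenue": [150000, 280000, 420000, 310000],
})

# Same pre-set binning function as in the article
quartile_bin = partial(pd.qcut, q=4, labels=["Q1", "Q2", "Q3", "Q4"])


def add_levels(frame, columns, binner):
    """Return a copy of frame with a '<column>_level' column for each column.

    Hypothetical helper: binner is any callable that maps a Series to bins.
    """
    new_columns = {f"{col}_level": binner(frame[col]) for col in columns}
    # assign returns a new DataFrame and leaves the input untouched
    return frame.assign(**new_columns)


processed_df = add_levels(df, ["salary", "bonus", "revenue"], quartile_bin)
print(processed_df)
```

Switching strategies then means passing a different partial, such as quintile_bin, as the third argument.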