A function should do only one task, not several. The function process_data tries to do multiple tasks: adding new features, adding one, and taking the sum of all columns. Comments help explain each block of code, but keeping them up to date takes a lot of work. It is also difficult to test each unit of code in isolation when they all live inside a single function.
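
To make the testing problem concrete, here is a minimal sketch. It assumes a process_data roughly like the one described above; the column names, the constant feature values, and the sample DataFrame are hypothetical and chosen only for illustration.

```python
import pandas as pd


def process_data(df: pd.DataFrame) -> pd.DataFrame:
    # Hypothetical stand-in for the multi-task function described above.
    data = df.copy()
    # Add new features (assumed: a constant column named "c")
    data["c"] = [1, 1, 1]
    # Add one (assumed: applied to column "a")
    data["a"] = data["a"] + 1
    # Sum all columns
    data["sum"] = data.sum(axis=1)
    return data


df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
result = process_data(df)

# The "add one" step can only be checked through the final output.
assert result["a"].tolist() == [2, 3, 4]

# The expected sum depends on the two earlier steps as well,
# so this check fails if any of the three steps changes.
assert result["sum"].tolist() == [2 + 4 + 1, 3 + 5 + 1, 4 + 6 + 1]
```

Because every step runs inside one function, a failing assertion on the sum column does not say which step broke. A test for one step cannot be written without running the others.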

A better practice is to split the function process_data into smaller functions that each do only one thing. In the code below, I split the function process_data into 4 different functions and apply these functions to a pandas DataFrame in order using pipe.
