map and format: Insert a String into a Pandas series
If you want to insert a string into every value of a pandas Series, use Series.map together with Python's str.format.
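A minimal sketch of the idea, assuming a small made-up Series and template string (neither comes from the original post):

```python
import pandas as pd

# Hypothetical sample data, for illustration only
s = pd.Series(["A", "B", "C"])

# Passing the bound method str.format to map applies it to each value;
# "{}" marks where the value is inserted
result = s.map("Grade: {}".format)
print(result)
```

Because map accepts any callable, the bound method of the template string can be passed directly without wrapping it in a lambda.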
map and format: Insert a String into a Pandas series Read More »
If you want to filter the columns of a pandas DataFrame based on characters in their names, use DataFrame.filter. For example, you can select only the columns whose names contain the word “cat”.
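A short sketch of this kind of column filtering, assuming hypothetical column names chosen for illustration:

```python
import pandas as pd

# Hypothetical DataFrame: two columns contain "cat" in their names, one does not
df = pd.DataFrame({"cat1": ["a", "b"], "cat2": ["b", "c"], "num1": [1, 2]})

# like="cat" keeps every column whose name contains the substring "cat"
cat_columns = df.filter(like="cat", axis=1)
print(cat_columns)
```

DataFrame.filter also accepts a regex argument when a plain substring match is not specific enough.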
df.filter: Filter Columns Based on a Subset of Their Names Read More »
If you want to bin a column’s values into intervals that contain roughly the same number of elements, use pandas.qcut.
For example, a column “a” with 6 values can be separated into 3 intervals, each of which contains 2 elements.
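One way this could look, assuming six made-up values for the column "a":

```python
import pandas as pd

# Hypothetical values for column "a"
df = pd.DataFrame({"a": [1, 3, 7, 11, 14, 17]})

# q=3 asks for three quantile-based bins with roughly equal counts
df["binned_a"] = pd.qcut(df["a"], q=3)

# Count how many values fall into each interval
counts = df["binned_a"].value_counts()
print(counts)
```

Unlike pandas.cut, which splits the value range into equal-width bins, qcut chooses the bin edges from the data's quantiles.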
pandas.qcut: Bin a DataFrame’s Values into Equal-Sized Intervals Read More »
Sometimes, you might want to include a table in a Markdown document, such as a GitHub README. If you want to print a DataFrame in Markdown format, use to_markdown().
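A hedged sketch with made-up data. Note that to_markdown depends on the optional tabulate package, so the example guards against it being missing:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

try:
    # to_markdown requires the optional "tabulate" dependency
    markdown_table = df.to_markdown(index=False)
    print(markdown_table)
except ImportError:
    markdown_table = None
    print("Install tabulate to use DataFrame.to_markdown")
```

The resulting string can be pasted directly into a README or any other Markdown file.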
to_markdown: Print a DataFrame in Markdown Format Read More »
It is common to use groupby to get the statistics of rows in the same group such as count, mean, median, etc. If you want to group rows into a list instead, use lambda x: list(x).
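A minimal sketch, assuming hypothetical column names col1 and col2:

```python
import pandas as pd

# Hypothetical data: col2 is the grouping key, col1 holds the values to collect
df = pd.DataFrame(
    {"col1": [1, 2, 3, 4, 3], "col2": ["a", "a", "b", "b", "c"]}
)

# Collect the col1 values of each group into a Python list
grouped = df.groupby("col2").agg({"col1": lambda x: list(x)})
print(grouped)
```

Each row of the result holds one group key and the list of values that belonged to it.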
Group DataFrame’s Rows into a List Using groupby Read More »
When using pandas pipe, you might want to check whether each step of the pipeline transforms your pandas DataFrame correctly.
To automatically log information about a pandas DataFrame after each step, use the log_step decorator from sklego.
Find more ways to customize your logging here.
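To illustrate the general idea without relying on sklego, here is a hand-rolled sketch. The log_shape decorator and the step functions below are hypothetical and are not sklego's log_step; they only show how a decorator can report on the DataFrame after each step of a pipe chain:

```python
import functools
import logging

import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def log_shape(func):
    """Hypothetical decorator: log the shape of the DataFrame a step returns."""

    @functools.wraps(func)
    def wrapper(df, *args, **kwargs):
        result = func(df, *args, **kwargs)
        logger.info("%s -> shape %s", func.__name__, result.shape)
        return result

    return wrapper


@log_shape
def drop_missing(df):
    # Remove rows that contain missing values
    return df.dropna()


@log_shape
def add_total(df):
    # Add a column with the row-wise sum of "a" and "b"
    return df.assign(total=df["a"] + df["b"])


# Hypothetical input data
df = pd.DataFrame({"a": [1, 2, None], "b": [4, 5, 6]})
processed = df.pipe(drop_missing).pipe(add_total)
```

Each decorated step emits one log line, which makes it easier to spot the step where rows or columns change unexpectedly.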
Logging in Pandas Pipelines Read More »
Have you ever tried to make a copy of a DataFrame using =? You will not get a copy but a reference to the original DataFrame. Thus, changing the new DataFrame will also change the original DataFrame.
A better way to make a copy is to use df.copy(). Now, changing the copy will not affect the original DataFrame.
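The difference can be sketched as follows, using made-up data:

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3]})

# Plain assignment only creates a second name for the same object
alias = df
alias["col1"] = [7, 8, 9]
print(df["col1"].tolist())  # the original changed as well

# copy() creates an independent DataFrame
independent = df.copy()
independent["col1"] = [0, 0, 0]
print(df["col1"].tolist())  # the original is unaffected this time
```

By default, copy() makes a deep copy, so the new DataFrame has its own data and index.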
DataFrame.copy(): Make a Copy of a DataFrame Read More »
If you want to merge one DataFrame with another based on matching values in 2 columns, use df.merge.
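A minimal sketch, assuming two hypothetical DataFrames whose key columns have different names:

```python
import pandas as pd

# Hypothetical DataFrames with differently named key columns
df1 = pd.DataFrame({"left_key": [1, 2, 3], "a": [4, 5, 6]})
df2 = pd.DataFrame({"right_key": [1, 2, 5], "b": [7, 8, 9]})

# The default inner join keeps only rows whose keys appear in both DataFrames
merged = df1.merge(df2, left_on="left_key", right_on="right_key")
print(merged)
```

When the key column has the same name in both DataFrames, on="key" can replace the left_on/right_on pair, and the how parameter selects left, right, or outer joins instead.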
df.merge: Merge DataFrame Based on Columns Read More »
If you want to get the count of elements in one column of a pandas DataFrame, use groupby and count.
If you want to get the size of groups composed of 2 or more columns, use groupby and size instead.
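Both cases can be sketched like this, assuming hypothetical columns col1 and col2:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame(
    {
        "col1": ["a", "b", "b", "c", "c", "d"],
        "col2": ["S", "S", "M", "L", "L", "L"],
    }
)

# Count the non-null col2 values for each value of col1
counts = df.groupby("col1")["col2"].count()
print(counts)

# Number of rows in each (col1, col2) combination
sizes = df.groupby(["col1", "col2"]).size()
print(sizes)
```

One difference worth knowing: count skips missing values, while size counts every row in the group.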
size: Compute the Size of Each Group Read More »
If you want to compute the pairwise correlation between rows or columns of two DataFrames, use corrwith.
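A small sketch with made-up data, chosen so that one column pair is perfectly positively correlated and the other perfectly negatively correlated:

```python
import pandas as pd

# Hypothetical DataFrames that share the column names "a" and "b"
df1 = pd.DataFrame({"a": [1, 2, 3, 4], "b": [2, 3, 4, 6]})
df2 = pd.DataFrame({"a": [2, 4, 6, 8], "b": [-2, -3, -4, -6]})

# Correlate each column of df1 with the column of the same name in df2
corr = df1.corrwith(df2)
print(corr)
```

Passing axis=1 computes the correlation row by row instead of column by column.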
pandas.DataFrame.corrwith: Compute Pairwise Correlation Between 2 DataFrame Read More »