...
Generic selectors
Exact matches only
Search in title
Search in content
Post Type Selectors
Filter by Categories
About Article
Analyze Data
Archive
Best Practices
Better Outputs
Blog
Code Optimization
Code Quality
Command Line
Daily tips
Dashboard
Data Analysis & Manipulation
Data Engineer
Data Visualization
DataFrame
Delta Lake
DevOps
DuckDB
Environment Management
Feature Engineer
Git
Jupyter Notebook
LLM
LLM Tools
Machine Learning
Machine Learning & AI
Machine Learning Tools
Manage Data
MLOps
Natural Language Processing
NumPy
Pandas
Polars
PySpark
Python Helpers
Python Tips
Python Utilities
Scrape Data
SQL
Testing
Time Series
Tools
Visualization
Visualization & Reporting
Workflow & Automation
Workflow Automation

Doctest: Keeping Python Docstrings Accurate and Relevant

Table of Contents

Doctest: Keeping Python Docstrings Accurate and Relevant

Docstring examples clarify function purpose and use, but may become outdated as code changes.

Python’s doctest module solves this by automatically testing docstring examples, ensuring they remain accurate as the code evolves.

import pandas as pd

def check_numerical_columns(df: pd.DataFrame, columns: list[str]) -> list[str]:
    """
    Examples
    --------
    >>> import pandas as pd
    >>> df = pd.DataFrame({"A": [1, 2], "B": [4, 5], "C": ["a", "b"]})
    >>> check_numerical_columns(df, ["A", "B"])
    ['A', 'B']
    >>> check_numerical_columns(df, ["A", "C"])
    Traceback (most recent call last):
    ...
    TypeError: Some of the variables are not numerical. Please cast them as numerical before using this transformer.
    """
    if len(df[columns].select_dtypes(exclude="number").columns) > 0:
        raise TypeError(
            "Some of the variables are not numerical. Please cast them as "
            "numerical before using this transformer."
        )

    return columns

To test the docstrings, add these lines at the end of your module:

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Now, when you run the script, doctest will verify all interactive examples:

$ python src/process_data.py

If there’s no output, all examples worked as expected.

For a detailed log of the tests, use the -v flag:

$ python src/process_data.py -v
Trying:
    import pandas as pd
Expecting nothing
ok
Trying:
    df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": ["a", "b", "c"]})
Expecting nothing
ok
Trying:
    check_numerical_columns(df, ["A", "B"])
Expecting:
    ['A', 'B']
ok
Trying:
    check_numerical_columns(df, ["A", "C"])
Expecting:
    Traceback (most recent call last):
    ...
    TypeError: Some of the variables are not numerical. Please cast them as numerical before using this transformer.
ok

This will show you exactly which parts of your examples are being tested and their outcomes.

Leave a Comment

Your email address will not be published. Required fields are marked *

Add Your Heading Text Here

0
    0
    Your Cart
    Your cart is empty
    Scroll to Top

    Work with Khuyen Tran

    Work with Khuyen Tran

    Seraphinite AcceleratorOptimized by Seraphinite Accelerator
    Turns on site high speed to be attractive for people and search engines.