Testing

Check Conflicting Labels with Deepchecks

Your data might contain samples with identical features but different labels, often because some samples were mislabeled.

Identify these conflicting labels before you use the data to train your ML model. To check for conflicting labels in your data, use deepchecks.

In the example above, deepchecks identified that samples 0 and 1 have the same features but different labels. 
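
The idea behind this check can be sketched without the library. The snippet below is only an illustration using the standard library, not deepchecks's implementation or API; the `rows` data and the `find_conflicting_labels` helper are hypothetical names made up for this sketch.

```python
from collections import defaultdict

# Hypothetical toy data: rows 0 and 1 share features but disagree on the label.
rows = [
    {"col1": 1, "col2": 2, "label": "a"},
    {"col1": 1, "col2": 2, "label": "b"},
    {"col1": 3, "col2": 4, "label": "a"},
]


def find_conflicting_labels(rows, label_key="label"):
    """Return lists of row indices that share features but not labels."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        # Use every column except the label as the grouping key.
        features = tuple(sorted((k, v) for k, v in row.items() if k != label_key))
        groups[features].append((index, row[label_key]))

    conflicts = []
    for members in groups.values():
        labels = {label for _, label in members}
        if len(labels) > 1:  # same features, more than one distinct label
            conflicts.append([index for index, _ in members])
    return conflicts


conflicts = find_conflicting_labels(rows)
print(conflicts)  # [[0, 1]]
```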

Link to deepchecks.

My previous tips on testing.

Assign IDs to Pytest Parametrize

With pytest parametrize, it can be difficult to understand the role of each test case from its auto-generated ID. Add the ids parameter to pytest.mark.parametrize to assign a name to each test case.

In the code above, the first test case is shown as neg-neg instead of [-1--2]. This makes it easier for others to understand the role of each test case.
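
A minimal sketch of this pattern is shown here. The `multiply` function, the test name, and the second and third IDs are hypothetical; only the `ids` argument of `pytest.mark.parametrize` and the neg-neg name come from the tip itself.

```python
import pytest


def multiply(a, b):
    """Hypothetical function under test."""
    return a * b


@pytest.mark.parametrize(
    "a, b",
    [(-1, -2), (-1, 2), (1, 2)],
    # One readable name per parameter set, in the same order.
    ids=["neg-neg", "neg-pos", "pos-pos"],
)
def test_product_sign(a, b):
    # The product is positive exactly when both inputs have the same sign.
    same_sign = (a < 0) == (b < 0)
    assert (multiply(a, b) > 0) == same_sign
```

Running `pytest -v` on a file containing this test should list the cases with IDs such as `test_product_sign[neg-neg]`.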

My previous tips on testing in Python.

Deepchecks + Weights & Biases: Test and Track Your ML Model and Data

Weights & Biases is a tool to track and monitor your ML experiments. deepchecks is a tool that lets you create test suites for your ML models and data with ease.

The checks in a suite include:

🔎 model performance

🔎 data integrity

🔎 distribution mismatches

and more.

Now you can track a deepchecks suite's results with Weights & Biases, as shown above.

Here is how to create and track a test suite.

pytest parametrize twice: Test All Possible Combinations of Two Sets of Parameters

If you want to test the combinations of two sets of parameters, writing out every combination by hand is time-consuming and difficult to read.

You can save time by applying pytest.mark.parametrize twice instead. From the output of pytest, we can see that all possible combinations of the given functions and inputs are tested.
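
A minimal sketch of stacking the decorator follows. The functions `double` and `square`, the input values, and the test name are hypothetical choices for illustration.

```python
import pytest


def double(x):
    return x * 2


def square(x):
    return x * x


# Stacked decorators produce the Cartesian product of their parameter sets:
# 2 functions x 3 values = 6 test cases.
@pytest.mark.parametrize("func", [double, square])
@pytest.mark.parametrize("value", [0, 1, 2.5])
def test_result_is_non_negative(func, value):
    assert func(value) >= 0
```

Adding a value to either list adds a whole row of combinations without touching the other list.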

My previous tips on testing.

ipytest: Unit Tests in IPython Notebooks

It is important to write unit tests for your functions to make sure they work as expected, even for experimental code in your Jupyter Notebook. However, it can be difficult to run unit tests in a notebook.

Luckily, ipytest allows you to run pytest inside the notebook environment. To use ipytest, import it, call ipytest.autoconfig(), then add %%ipytest -qq at the top of each cell whose tests you want to run.

Link to ipytest.

My previous tips on Jupyter Notebook.

Deepchecks: Check Category Mismatch Between Train and Test Set

Sometimes, it is important to know whether your test set contains the same categories as the train set. To check for a category mismatch between the train and test sets, use Deepchecks's CategoryMismatchTrainTest.

In the example above, the result shows that there are 2 new categories in the test set. They are ‘d’ and ‘e’.
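
What the check reports can be sketched with plain sets. This is a library-free illustration rather than Deepchecks's implementation or API, and the two category lists are made-up data chosen to match the result described above.

```python
# Hypothetical values of one categorical column in each split.
train_categories = ["a", "b", "a", "c"]
test_categories = ["a", "d", "b", "e", "c"]

# Categories that appear in the test set but never in the train set.
new_categories = sorted(set(test_categories) - set(train_categories))

print(len(new_categories), new_categories)  # 2 ['d', 'e']
```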

Link to Deepchecks.

My previous tips on testing.

hypothesis: Property-based Testing in Python

If you want to test some properties or assumptions, it can be cumbersome to write out a wide range of scenarios by hand.

To automatically run your tests against a wide range of scenarios and find edge cases in your code that you would otherwise have missed, use hypothesis.

In the code above, I test if the addition of two floats is commutative. The test fails when either x or y is NaN. Now I can rewrite my code to make it more robust against these edge cases.

Learn more about hypothesis here.

Link to my previous tips about testing.
