Newsletter Archive

Automated newsletter archive from Klaviyo campaigns

Newsletter #254: Pydantic v2.12: Skip Computed Fields During Serialization

📅 Today’s Picks

Pydantic v2.12: Skip Computed Fields During Serialization

Problem
By default, Pydantic’s model_dump() serializes computed fields alongside the base fields used to derive them.
This duplicates data and increases API response sizes.
Solution
Pydantic v2.12 adds the exclude_computed_fields parameter to model_dump().
This lets you keep computed fields for internal use while excluding them from API responses.
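A minimal sketch of the difference, assuming a hypothetical Rectangle model (the model and its fields are made up for illustration). Because exclude_computed_fields only exists in Pydantic 2.12 and later, the sketch checks for the parameter and falls back to an explicit exclude on older versions:

```python
import inspect

from pydantic import BaseModel, computed_field


class Rectangle(BaseModel):  # hypothetical model for illustration
    width: float
    height: float

    @computed_field
    @property
    def area(self) -> float:
        # Derived from the base fields, so it duplicates information
        return self.width * self.height


rect = Rectangle(width=3.0, height=4.0)

# Default behavior: the computed field is serialized with the base fields
full = rect.model_dump()

# exclude_computed_fields is only available in Pydantic >= 2.12
params = inspect.signature(BaseModel.model_dump).parameters
if "exclude_computed_fields" in params:
    slim = rect.model_dump(exclude_computed_fields=True)
else:
    # Fallback for older versions: name the computed field explicitly
    slim = rect.model_dump(exclude={"area"})

print(full)
print(slim)
```

The area property stays available on the instance (rect.area) either way; only the serialized output changes.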

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

llm-council
[LLM]
– Query multiple LLMs in parallel, anonymize responses, and have them rank each other for better answers

skweak
[ML]
– Build NER models without labeled data using weak supervision for NLP tasks

wrapt
[Python Utils]
– Create transparent decorators, wrappers, and monkey patches in Python

Looking for a specific tool? Explore 70+ Python tools →

Stay Current with CodeCut

Actionable Python tips, curated for busy data pros. Skim in under 2 minutes, three times a week.


Newsletter #253: Docling: Auto-Annotate PDF Images Locally

📅 Today’s Picks

Docling: Auto-Annotate PDF Images Locally

Problem
Images in PDFs like charts, diagrams, and figures are invisible to search and analysis. Manually writing descriptions for hundreds of figures is impractical.
You could use cloud APIs like Gemini or ChatGPT, but that means API costs at scale and your documents leaving your infrastructure.
Solution
Docling runs local vision language models (Granite Vision, SmolVLM) to automatically generate descriptive annotations for every picture in your documents, keeping data private.
Key benefits:

Privacy: Data stays local, works offline
Cost: No per-image API fees
Flexibility: Customizable prompts, any HuggingFace model

📖 View Full Article

🧪 Run code

⭐ View GitHub

Rembg: Remove Image Backgrounds in 2 Lines of Python

Problem
Removing backgrounds from images typically requires Photoshop, online tools, or AI assistants like ChatGPT.
But these options come with subscription costs, upload limits, or privacy concerns with your images on external servers.
Solution
Rembg uses AI models to remove backgrounds locally with just 2 lines of Python.
It’s also open source and compatible with common Python imaging libraries.

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

label-studio
[MLOps]
– Multi-type data labeling and annotation tool with standardized output format

reflex
[Python Utils]
– Build full-stack web apps in pure Python – no JavaScript required

TradingAgents
[LLM]
– Multi-agent LLM financial trading framework


Newsletter #252: Build Fast Recommendations with Annoy’s Memory-Mapped Indexes

📅 Today’s Picks

Build Fast Recommendations with Annoy’s Memory-Mapped Indexes

Problem
A brute-force nearest-neighbor search with scikit-learn loads all your item vectors into memory and compares your search vector against every item in your dataset.
This can take seconds or minutes when you have millions of items.
Solution
Annoy (Approximate Nearest Neighbors Oh Yeah), built by Spotify, speeds up similarity search by organizing your vectors into a searchable tree structure.
How it works:

Pre-builds indexes with “build(n_trees)”, creating multiple trees by recursively splitting your vector space with random hyperplanes
Traverses tree splits to find the n nearest neighbors using “get_nns_by_item(i, n)”
Checks only items in the final region instead of scanning everything

As a result, you can query millions of items in milliseconds instead of seconds.
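To make the tree idea above concrete, here is a simplified pure-Python sketch of a single random-hyperplane tree. It is not Annoy's implementation: Annoy builds many trees, searches them with a priority queue, and stores the index in a memory-mapped file. The function names and the leaf_size parameter are assumptions made for illustration.

```python
import random


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def build_tree(ids, vectors, rng, leaf_size=8):
    # Stop splitting once a region is small enough to scan directly
    if len(ids) <= leaf_size:
        return {"leaf": ids}
    # Two random items define a hyperplane halfway between them
    a, b = (vectors[i] for i in rng.sample(ids, 2))
    normal = [x - y for x, y in zip(a, b)]
    offset = dot(normal, [(x + y) / 2 for x, y in zip(a, b)])
    left = [i for i in ids if dot(normal, vectors[i]) < offset]
    right = [i for i in ids if dot(normal, vectors[i]) >= offset]
    if not left or not right:  # degenerate split, e.g. duplicate vectors
        return {"leaf": ids}
    return {
        "normal": normal,
        "offset": offset,
        "left": build_tree(left, vectors, rng, leaf_size),
        "right": build_tree(right, vectors, rng, leaf_size),
    }


def query(tree, vectors, target, n):
    # Follow the splits down to the one region that contains the target
    node = tree
    while "leaf" not in node:
        side = "left" if dot(node["normal"], target) < node["offset"] else "right"
        node = node[side]

    # Rank only the items in that region by squared Euclidean distance
    def sq_dist(i):
        return sum((x - y) ** 2 for x, y in zip(vectors[i], target))

    return sorted(node["leaf"], key=sq_dist)[:n]


rng = random.Random(0)
vectors = [[rng.random() for _ in range(5)] for _ in range(1000)]
tree = build_tree(list(range(1000)), vectors, rng)
neighbors = query(tree, vectors, vectors[42], n=3)
print(neighbors)
```

A single tree can miss true neighbors that fall on the other side of a split, which is why the result is approximate and why building more trees generally improves recall.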

🧪 Run code

⭐ View GitHub

Build Reliable DataFrame Tests with assert_frame_equal

Problem
Testing numerical code with regular assertions can lead to false failures from floating-point precision.
Your perfectly correct function fails tests because 0.1 + 0.2 doesn’t exactly equal 0.3 in computer arithmetic.
Solution
Use numpy.testing and pandas.testing utilities for robust numerical comparisons.
Key approaches:

assert_array_almost_equal for NumPy arrays with decimal precision control
pd.testing.assert_frame_equal for DataFrame comparisons with tolerance
Handle floating-point arithmetic limitations properly
Get reliable test results for numerical data processing

Professional data science requires proper numerical testing methods.
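A short sketch of both approaches, using made-up values. The decimal and rtol settings are illustrative choices, not recommendations:

```python
import numpy as np
import pandas as pd

# Plain equality fails because of floating-point rounding
print(0.1 + 0.2 == 0.3)  # False

# NumPy: compare arrays to a chosen number of decimal places
actual = np.array([0.1 + 0.2, 1.0 / 3.0])
expected = np.array([0.3, 0.333333333])
np.testing.assert_array_almost_equal(actual, expected, decimal=6)

# pandas: compare DataFrames within a relative tolerance
result = pd.DataFrame({"total": [0.1 + 0.2, 0.7 + 0.1]})
target = pd.DataFrame({"total": [0.3, 0.8]})
pd.testing.assert_frame_equal(result, target, check_exact=False, rtol=1e-9)
```

Both assertion helpers raise AssertionError with a readable diff when the values fall outside the tolerance, so they drop into pytest tests without extra wiring.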

📖 Learn more

🧪 Run code

☕️ Weekly Finds

sympy
[Python Utils]
– A computer algebra system written in pure Python for symbolic mathematics

qdrant
[LLM]
– High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI

mindsdb
[ML]
– Federated query engine for AI – connect to hundreds of data sources and generate intelligent responses using built-in agents


Newsletter #251: PySpark 4.0: Native Plotting API for DataFrames

📅 Today’s Picks

PySpark 4.0: Native Plotting API for DataFrames

Problem
Visualizing PySpark DataFrames typically requires converting to Pandas first, adding memory overhead and extra processing steps.
Solution
PySpark 4.0 adds native Plotly-powered plotting, enabling direct .plot() calls on DataFrames without Pandas conversion.

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

rembg
[Python Utils]
– Rembg is a tool to remove image backgrounds

pyupgrade
[Python Utils]
– A tool (and pre-commit hook) to automatically upgrade syntax for newer versions of the language

py-shiny
[Data Viz]
– Shiny for Python is the best way to build fast, beautiful web applications in Python


Newsletter #250: Extract Text from Any Document Format with Docling

📅 Today’s Picks

Build Schema-Flexible Pipelines with Polars Selectors

Problem
Hard-coding column names can break your code when the schema changes.
When columns of the same type are added or removed, you must update your code manually.
Solution
The Polars col() function accepts data types, so it can select all matching columns automatically.
This keeps your code flexible and robust to schema changes.

📖 View Full Article

🧪 Run code

⭐ View GitHub

Extract Text from Any Document Format with Docling

Problem
Have you ever needed to pull text from PDFs, Word files, slide decks, or images for a project? Writing a different parser for each format is slow and error-prone.
Solution
Docling’s DocumentConverter takes care of that by detecting the file type and applying the right parsing method for PDF, DOCX, PPTX, HTML, and images.
Other features of Docling:

AI-powered image descriptions for searchable diagrams
Export to pandas DataFrames, JSON, or Markdown
Structure-preserving output optimized for RAG pipelines
Built-in chunking strategies for vector databases
Parallel processing handles large document batches efficiently

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

evals
[LLM]
– Framework for evaluating large language models (LLMs) or systems built using LLMs with existing registry of evals and ability to write custom evals

sklearn-bayes
[ML]
– Python package for Bayesian Machine Learning with scikit-learn API

databonsai
[Data Processing]
– Python library that uses LLMs to perform data cleaning tasks for categorization, transformation and curation


Newsletter #249: prek: Faster, Leaner Pre-Commit Hooks (Rust-Powered)

📅 Today’s Picks

LangChain v1.0: Automate Tool Selection for Faster Agents

Problem
Agents with many tools send every tool description with each request.
Most of those descriptions are irrelevant to the request at hand, so they waste tokens and make responses slower and more expensive.
Solution
LangChain v1.0 introduces LLMToolSelectorMiddleware that pre-filters relevant tools using a smaller model.
Key features:

Pre-filter tools using cheaper models like GPT-4o-mini
Limit tools sent to main agent (e.g., 3 most relevant)
Preserve critical tools with always_include parameter

📖 View Full Article

🧪 Run code

⭐ View GitHub

prek: Faster, Leaner Pre-Commit Hooks (Rust-Powered)

Problem
pre-commit is a framework for managing Git hooks that automatically run code quality checks before commits.
However, installing these hook environments (linters, formatters, etc.) can be slow and disk-intensive, especially in CI/CD pipelines where speed matters.
Solution
prek is a drop-in replacement for pre-commit that installs hook environments significantly faster while using 50% less disk space.
Built with Rust for maximum performance, prek reduces cache storage from 1.6GB to 810MB (benchmarked on Apache Airflow repository) without changing your workflow.
Key benefits:

Uses your existing .pre-commit-config.yaml files
Commands mirror pre-commit syntax (prek install-hooks, prek run)
Monorepo support with selector syntax for targeting specific projects or hooks
Install as a single binary with no dependencies

No configuration changes needed – just replace the command.

⭐ View GitHub

☕️ Weekly Finds

deepagents
[LLM]
– Build advanced AI agents with context isolation through sub-agent delegation. Features virtual file system for context offloading, specialized sub-agents with focused tool sets, and sophisticated agent architecture for real-world research and analysis tasks.

mcp-gateway
[MLOps]
– Docker MCP CLI plugin / MCP Gateway for production-grade AI agent stack. Enables multi-agent orchestration, intelligent interceptors, and enterprise security with Docker integration.

nbQA
[Python Utils]
– Run ruff, isort, pyupgrade, mypy, pylint, flake8, and more on Jupyter Notebooks. Command-line tool to run linters and formatters over Python code in Jupyter notebooks.


Code example: Build Mathematical Animations with Manim in Python

Newsletter #248: Build Mathematical Animations with Manim in Python – Fixed

📅 Today’s Picks

Build Mathematical Animations with Manim in Python

Problem
Static slides can only go so far when you’re explaining complex concepts.
Dynamic visuals make abstract ideas clearer, more engaging, and easier to understand.
Solution
Manim gives you the power to create professional mathematical animations in Python, just like the ones you see in 3Blue1Brown’s videos.
A typical Manim script transforms equations into smooth visual steps:

Define equation steps using MathTex with LaTeX notation
Animate equation transformations with the Transform class
Control animation flow with play() and wait() methods
Render output with simple command: manim -p -ql script.py

📖 View Full Article

⭐ View GitHub

☕️ Weekly Finds

fast-langdetect
[Python Utils]
– 80x faster and 95% accurate language identification with Fasttext

FuncToWeb
[Python Utils]
– Transform any Python function into a web interface automatically

graphic-walker
[Data Viz]
– An open source alternative to Tableau for data exploration and visualization


Newsletter #247: whenever: Simple Python Timezone Conversion

🤝 COLLABORATION

Build Safer APIs with Buf – Free Workshop
Building APIs is simple. Scaling them across teams and systems isn’t. Ensuring consistency, compatibility, and reliability quickly becomes a challenge as projects grow.
Buf provides a toolkit that makes working with Protocol Buffers faster, safer, and more consistent.
Join Buf for a live, one-hour workshop on building safer, more consistent APIs.
When: Nov 19, 2025 • 9 AM PST | 12 PM EST | 5 PM GMT
What you’ll learn:

How Protobuf makes API development safer and simpler
API design best practices for real-world systems
How to extend Protobuf to data pipelines and streaming systems

Register for the workshop

📅 Today’s Picks

whenever: Simple Python Timezone Conversion

Problem
Adding 8 hours to 10pm shouldn’t give you the wrong morning time, but with Python’s datetime, it can.
Standard-library datetime arithmetic works on wall-clock time and ignores DST transitions, so results can be off by an hour when clocks change for daylight saving.
Solution
Whenever provides simple, explicit timezone conversion methods with clear semantics.
Key benefits:

DST-safe arithmetic with automatic offset adjustment
Type safety prevents naive/aware datetime bugs
Clean timezone conversions with .to_tz()
Nanosecond precision for deltas and timestamps
Pydantic integration for serialization
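The DST pitfall described above can be reproduced with the standard library alone. This sketch uses only datetime and zoneinfo, not the whenever API, and picks the night before US clocks moved forward in 2025 as an illustrative date:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")
# 10 PM on the night before US clocks spring forward (March 9, 2025)
start = datetime(2025, 3, 8, 22, 0, tzinfo=tz)

# Aware-datetime arithmetic adds wall-clock hours and ignores the skipped hour
wall_clock = start + timedelta(hours=8)

# Going through UTC counts eight real elapsed hours
elapsed = (start.astimezone(timezone.utc) + timedelta(hours=8)).astimezone(tz)

print(wall_clock.isoformat())  # 2025-03-09T06:00:00-04:00
print(elapsed.isoformat())     # 2025-03-09T07:00:00-04:00
```

The first result is only seven real hours after the start. The UTC round trip is the manual workaround that DST-safe arithmetic is meant to make unnecessary.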

🧪 Run code

⭐ View GitHub

Build Readable Scatter Plots with adjustText Auto-Positioning

Problem
Text labels in matplotlib scatter plots frequently overlap with each other and data points, creating unreadable visualizations.
Manually repositioning each label to avoid overlaps is tedious and time-consuming.
Solution
adjustText automatically repositions labels to eliminate overlaps while connecting them to data points with arrows.
All you need is to collect your text objects and call adjust_text() with optional arrow styling.

🧪 Run code

⭐ View GitHub

📢 ANNOUNCEMENTS

Featured on LeanPub: Production-Ready Data Science
My book Production-Ready Data Science was featured on the LeanPub home page!
LeanPub is a leading platform for publishing and selling self-published technical books, so it’s truly an honor to see my work highlighted there.
The book shares everything I’ve learned about turning data science prototypes into reliable, production-ready systems, from managing dependencies to automating workflows.
Thank you to everyone who has purchased or shared it. Your support means everything.
The book is currently on sale for 58% off until November 16.

Get Your Copy Now (58% Off)

☕️ Weekly Finds

featuretools
[ML]
– An open source python library for automated feature engineering

datachain
[Data Processing]
– ETL, Analytics, Versioning for Unstructured Data – AI-data warehouse to enrich, transform and analyze data from cloud storages

logfire
[Python Utils]
– Uncomplicated Observability for Python and beyond – an observability platform built on OpenTelemetry from the team behind Pydantic



Newsletter #246: Faster Polars Queries with Programmatic Expressions

🤝 COLLABORATION

Build Safer APIs with Buf – Free Workshop
Building APIs is simple. Scaling them across teams and systems isn’t. Ensuring consistency, compatibility, and reliability quickly becomes a challenge as projects grow.
Buf provides a toolkit that makes working with Protocol Buffers faster, safer, and more consistent.
Join Buf for a live, one-hour workshop on building safer, more consistent APIs.
When: Nov 19, 2025 • 9 AM PDT | 12 PM EDT | 5 PM BST
What you’ll learn:

How Protobuf makes API development safer and simpler
API design best practices for real-world systems
How to extend Protobuf to data pipelines and streaming systems

Register for the workshop

📅 Today’s Picks

Faster Polars Queries with Programmatic Expressions

Problem
When you apply similar transformations in a for loop, each Polars with_columns() call runs as a separate, sequential step.
Polars never sees the full computation plan, so it cannot optimize or parallelize across the steps.
Solution
Instead, generate all Polars expressions programmatically before applying them together.
This enables Polars to:

See the complete computation plan upfront
Optimize across all expressions simultaneously
Parallelize operations across CPU cores

📖 View Full Article

🧪 Run code

⭐ View GitHub

itertools.chain: Merge Lists Without Intermediate Copies

Problem
Merging lists with extend() or concatenation materializes every element in a combined list, and chained concatenation builds intermediate copies along the way.
This memory overhead becomes significant when processing large lists.
Solution
itertools.chain() lazily merges multiple iterables without creating intermediate lists.
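
A small sketch of the difference, using made-up lists (the variable names are illustrative only):

```python
from itertools import chain

first = [1, 2, 3]
second = [4, 5]
third = [6]

# Concatenation builds a new list at each "+" step
merged_copy = first + second + third

# chain() returns an iterator that walks the original lists in order
# without building a combined list
merged_lazy = chain(first, second, third)

# Consume the iterator directly, for example in an aggregation
total = sum(merged_lazy)
print(total)
```

Note that a chain object is an iterator: it can be consumed only once, so call chain() again (or wrap it in list()) if you need a second pass.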

📖 View Full Article

🧪 Run code

☕️ Weekly Finds

fiftyone
[ML]
– Open-source tool for building high-quality datasets and computer vision models

llama-stack
[LLM]
– Composable building blocks to build Llama Apps with unified API for inference, RAG, agents, and more

grip
[Python Utils]
– Preview GitHub README.md files locally before committing them using GitHub’s markdown API



Newsletter #245: PySpark: Avoid Double Conversions with applyInArrow

📅 Today’s Picks

PySpark: Avoid Double Conversions with applyInArrow

Problem
applyInPandas lets you apply Pandas functions in PySpark by converting data from Arrow→Pandas→Arrow for each operation.
This double conversion adds serialization overhead that slows down your transformations.
Solution
applyInArrow (introduced in PySpark 4.0.0) works directly with PyArrow tables, eliminating the Pandas conversion step entirely.
This keeps data in Arrow’s columnar format throughout the pipeline, which removes the conversion overhead and can make operations noticeably faster on large datasets.
Trade-off: PyArrow’s syntax is less intuitive than Pandas, but it’s worth it if you’re processing large datasets where performance matters.

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

causal-learn
[ML]
– Python package for causal discovery that implements both classical and state-of-the-art causal discovery algorithms

POT
[ML]
– Python Optimal Transport library providing solvers for optimization problems related to signal, image processing and machine learning

qdrant
[MLOps]
– High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI




    Work with Khuyen Tran