Newsletter Archive

Automated newsletter archive from Klaviyo campaigns

Newsletter #229: latexify: Turn Python Functions Into Clean Math Formulas

📅 Today’s Picks

Build Faster Tests with pytest Session Fixtures

Problem
pytest fixtures provide reusable test data, but they are function-scoped by default, so each fixture is re-created for every test function that uses it.
When a fixture loads a large DataFrame, every test reloads the same data, which wastes time and slows down your development workflow.
Solution
Session-scoped fixtures load data once at the start and reuse it across all test functions.
Apply this pattern to:

Load large datasets once instead of reloading for each test function
Share a database connection across all tests without passing it as a parameter
Automatically set random seeds for reproducible train/test splits

📖 Learn more

🧪 Run code

latexify: Turn Python Functions Into Clean Math Formulas

Problem
Python code is a poor way to present mathematical formulas to executives and stakeholders, who are often unfamiliar with the language.
However, writing LaTeX manually to show the formulas is time-consuming and tedious.
Solution
latexify transforms Python functions into clean mathematical notation with a single decorator. No manual LaTeX required.
Key features:

Automatic LaTeX generation from Python functions
Functions remain executable for calculations
Compatible with various notebooks such as Jupyter, Colab, and Marimo

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

ty
[Python Utils]
– An extremely fast Python type checker and language server, written in Rust

giotto-tda
[ML]
– A high-performance topological machine learning toolbox in Python built on top of scikit-learn

vibekit
[MLOps]
– Run Claude Code, Gemini, Codex — or any coding agent — in a clean, isolated sandbox with sensitive data redaction and observability baked in

Looking for a specific tool? Explore 70+ Python tools →

Stay Current with CodeCut

Actionable Python tips, curated for busy data pros. Skim in under 2 minutes, three times a week.

Newsletter #228: Create Dynamic Scatter Plots with Plotly Animation

📅 Today’s Picks

Create Dynamic Scatter Plots with Plotly Animation

Problem
Static scatter plots can’t show how data clusters change over time.
Solution
Plotly Express creates animated scatter plots that change over time in one line of code.
Key benefits:

Simply add the animation_frame="time_column" parameter to px.scatter to create an animated scatter plot
Automatic smooth transitions between time periods
Built-in playback controls for user interaction
Works with any time-series dataset

📖 View Full Article

🧪 Run code

⭐ View GitHub

CloudQuery: Move RAG Data with 18-Line YAML (Sponsored)

Problem
RAG applications need data from various sources moved into vector stores. Manual API integration means writing boilerplate for rate limiting, pagination, and error handling instead of building AI.
Solution
CloudQuery handles the entire data-to-embeddings pipeline with declarative YAML config and native pgvector support.
Key benefits:

Pre-built connectors for AWS, GCP, Azure, and 100+ platforms
Sync state persistence with incremental processing and automatic schema evolution
Built-in PII removal, column obfuscation, and data cleaning for compliance
Native pgvector support: text splitting, embeddings, semantic indexing for RAG

📖 View Full Article

⭐ View GitHub

☕️ Weekly Finds

ShinkaEvolve
[ML]
– An open-source framework that evolves programs for scientific discovery with unprecedented sample-efficiency

claude-code-router
[LLM]
– A powerful tool to route Claude Code requests to different models and customize any request

data-formulator
[Data Viz]
– AI-driven tool designed to streamline the creation of data visualizations

Newsletter #227: LangGraph: Turn Any Python Function Into Agent Tools

📅 Today’s Picks

LangGraph: Turn Any Python Function Into Agent Tools

Problem
AI agents need specialized tools to interact with the world beyond their training data, such as searching the web, querying databases, executing code, and integrating with APIs.
However, as the number of tools grows, hand-written logic for matching each user request to the right tool becomes difficult to maintain.
Solution
LangGraph’s create_react_agent removes that routing logic by letting the LLM reason about which tool to call.
Key benefits of ReAct agents:

Handles fuzzy user requests by letting the LLM choose tools on the fly
Lets you drop in new @tool functions without touching control flow
Turns any Python function into an agent-accessible tool

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

MindsDB
[ML]
– AI data automation solution that connects and unifies petabyte scale enterprise data, enabling informed decision-making in real-time

gspread
[Python Utils]
– Google Sheets Python API for managing Google Spreadsheets programmatically

wrapt
[Python Utils]
– Python module for decorators, wrappers and monkey patching with transparent object proxy

Newsletter #226: Gradio: Turn Python Functions into Interactive AI Demos

📅 Today’s Picks

Query Nested JSON with DuckDB SQL Dot Notation

Problem
Working with nested JSON structures requires complex normalization steps in pandas before analysis.
Solution
DuckDB reads nested JSON files directly, inferring nested objects as STRUCT columns whose fields you can query with dot notation.
Other key benefits:

High-performance columnar engine for analytical workloads
Zero external dependencies – embedded database design
Native support for Parquet, CSV, JSON without data movement
Direct integration with pandas, NumPy, and Arrow format

📖 View Full Article

🧪 Run code

⭐ View GitHub

Gradio: Turn Python Functions into Interactive AI Demos

Problem
You built an AI model that works well for your use case in your notebook. But how do you demo it to stakeholders?
Your stakeholders expect clickable demos, not code snippets, but building web interfaces requires frontend expertise you don’t have.
Solution
With Gradio, you can create professional chat interfaces with just 10 lines of code.
Key benefits:

Instant UI generation from Python functions
Zero frontend coding required
Share live demos with URL links without any deployment

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

presidio
[Data Processing]
– Context aware, pluggable and customizable PII de-identification service for text and images

testcontainers-python
[Python Utils]
– Python library providing a friendly API to run Docker containers for functional and integration testing

shapash
[ML]
– Python library dedicated to the interpretability of Data Science models with explicit visualization labels

Newsletter #225: Query GitHub Issues with Natural Language Using LangChain

📅 Today’s Picks

Query GitHub Issues with Natural Language Using LangChain

Problem
Have you ever spent hours clicking through GitHub pages to understand project status, track bugs, or review recent changes? Manual repository analysis wastes development time that could be spent building features.
Solution
LangChain’s GitHubIssuesLoader converts repository issues and PRs into searchable content that responds to natural language questions about bugs, features, and project status.
This method integrates seamlessly with LangChain workflows.

📖 View Full Article

🧪 Run code

⭐ View GitHub

Mock External APIs for Fast, Reliable Tests

Problem
Testing with real APIs and databases is slow, expensive, and unreliable.
External dependencies create flaky tests that can fail due to network issues, rate limits, or service downtime rather than code problems.
Solution
The patch decorator from Python’s unittest.mock module replaces external calls with controllable mock objects for isolated testing.
Key benefits:

Reproducible results across different machines
Fast, reliable tests that focus on your logic
Test edge cases and error conditions that are hard to trigger naturally

Test your data processing logic without waiting for external services or consuming API quotas.
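
A minimal sketch using the patch.object form of the decorator from the standard library. RateClient, its placeholder URL, and convert_to_usd are hypothetical names standing in for your own API wrapper and logic.

```python
import json
import urllib.request
from unittest.mock import patch


class RateClient:
    """Hypothetical client for an external exchange-rate API."""

    def get_rate(self, currency):
        # Placeholder URL: the real call is slow and needs network access.
        url = f"https://api.example.com/rates/{currency}"
        with urllib.request.urlopen(url) as response:
            return json.load(response)["rate"]


def convert_to_usd(amount, currency):
    """The logic under test, which depends on the external API."""
    rate = RateClient().get_rate(currency)
    return round(amount * rate, 2)


@patch.object(RateClient, "get_rate", return_value=1.25)
def test_convert_to_usd(mock_get_rate):
    # No network call happens; the mock returns a fixed rate.
    assert convert_to_usd(10, "EUR") == 12.5
    mock_get_rate.assert_called_once_with("EUR")


@patch.object(RateClient, "get_rate", side_effect=TimeoutError("API timed out"))
def test_timeout_is_raised(mock_get_rate):
    # side_effect makes the mock raise, simulating a failure that is
    # hard to trigger against the real service.
    try:
        convert_to_usd(10, "EUR")
    except TimeoutError:
        pass
    else:
        raise AssertionError("expected TimeoutError")
```

Patching the method on the class keeps the test independent of module import paths. The string form, patch("package.module.name"), works the same way once you know where the name is looked up.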

📖 View Full Article

🧪 Run code

☕️ Weekly Finds

timesketch
[Python Utils]
– Collaborative forensic timeline analysis tool for organizing and analyzing forensic timelines

ExtractThinker
[LLM]
– AI-powered Document Intelligence library for LLMs, offering ORM-style interaction for flexible document workflows

ecco
[ML]
– Explain, analyze, and visualize NLP language models with interactive visualizations in Jupyter notebooks

Newsletter #224: Delta Lake vs pandas: Stop Silent Data Corruption

📅 Today’s Picks

Delta Lake vs pandas: Stop Silent Data Corruption

Problem
Pandas allows type coercion during DataFrame operations. A single string value can silently convert numeric columns to object dtype, breaking downstream systems and corrupting data integrity.
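
The silent coercion is easy to reproduce. This sketch uses only pandas, with made-up order amounts, and shows the problem side rather than the Delta Lake fix.

```python
import pandas as pd

# A clean numeric column.
orders = pd.DataFrame({"amount": [100, 250, 75]})
print(orders["amount"].dtype)  # an integer dtype

# A later batch arrives with a placeholder string in the numeric column.
new_batch = pd.DataFrame({"amount": [300, "N/A"]})

# pandas accepts the append without raising and falls back to object dtype.
combined = pd.concat([orders, new_batch], ignore_index=True)
print(combined["amount"].dtype)  # object
```

No exception is raised at any point, so the change only surfaces later, when a numeric operation or a downstream consumer hits the string.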
Solution
Delta Lake prevents these issues through strict schema enforcement at write time, validating data types before ingestion to maintain table integrity.
Other features of Delta Lake:

Time travel provides instant access to any historical data version
ACID transactions guarantee data consistency across all operations
Smart file skipping eliminates 95% of unnecessary data scanning
Incremental processing handles billion-row updates efficiently

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

ZeroFS
[Data Engineer]
– ZeroFS – The Filesystem That Makes S3 your Primary Storage. Provides file-level access via NFS and 9P and block-level access via NBD on S3 storage with encryption, caching, and high performance.

vicinity
[ML]
– Lightweight Nearest Neighbors with Flexible Backends. Provides a unified interface for vector similarity search with support for multiple backends like HNSW, FAISS, Annoy, and more.

vec2text
[LLM]
– Utilities for decoding deep representations (like sentence embeddings) back to text. Train models to reconstruct text sequences from embeddings and invert pre-trained embeddings.

Newsletter #223: ChromaDB’s Automatic Indexing: Fast Vector Search Made Easy

📅 Today’s Picks

Type-Safe Configuration Management with Hydra

Problem
Configuration errors and type mismatches often go undetected until runtime, wasting time and computing resources.
Solution
Hydra’s structured configurations with dataclasses validate types before your code runs, preventing configuration crashes.
What Hydra adds to dataclasses:

Runtime parameter overrides from command line
Configuration composition and inheritance
Built-in experiment management and logging
Run multiple parameters in one command

📖 Learn more

🧪 Run code

⭐ View GitHub

ChromaDB’s Automatic Indexing: Fast Vector Search Made Easy

Problem
Why is saving vector embeddings in a file not enough?
Basic file storage forces you to scan every single embedding for similarity search, creating massive performance bottlenecks as your dataset grows.
Solution
ChromaDB provides persistent vector storage with automatic indexing and metadata filtering capabilities.
Key benefits:

Find relevant content by meaning, not just keyword matching
Handle large datasets without memory crashes using efficient indexing
Complete toolkit included: similarity scoring, deduplication, search ranking, and more

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

wrapt
[Python Utils]
– A Python module for decorators, wrappers and monkey patching

TabPFN
[ML]
– A transformer-based foundation model for tabular data that outperforms traditional methods

superduperdb
[Data Processing]
– A Python framework for integrating AI models, APIs, and vector search engines directly with your existing databases

Newsletter #222: Build Dynamic AI Prompts with LangChain Templates

📅 Today’s Picks

DuckDB: Zero-Config SQL Database for DataFrames

Problem
Setting up database servers for SQL operations requires complex configuration, service management, and credential setup.
This creates barriers between data scientists and their analytical workflows.
Solution
DuckDB provides an embedded SQL database with zero configuration required.
Key benefits:

No server installation or management needed
Direct SQL operations on DataFrames and files
Compatible with pandas, Polars, and Arrow ecosystems
Fast analytical queries with columnar storage
Open-source with active development community

Query your data instantly without database administration overhead.

📖 View Full Article

🧪 Run code

⭐ View GitHub

Build Dynamic AI Prompts with LangChain Templates

Problem
Hard-coded prompts limit flexibility and make it difficult to adapt AI applications to different contexts or user inputs.
Creating separate functions for each prompt variation leads to duplicate code with no reusability.
Solution
LangChain’s PromptTemplate enables dynamic, reusable prompts with variable substitution.
Create one template that adapts to multiple contexts:

Variable substitution with {topic}, {audience}, {examples}
Single template for unlimited prompt variations
Clean, maintainable code structure
Compatible with all major LLM providers

Transform repetitive hard-coded prompts into flexible, reusable templates that scale with your AI application needs.

📖 View Full Article

⭐ View GitHub

☕️ Weekly Finds

GHunt
[Python Utils]
– Modulable OSINT tool designed to investigate Google accounts and objects using various techniques

nbQA
[Python Utils]
– Run ruff, isort, pyupgrade, mypy, pylint, flake8, and more on Jupyter Notebooks

pg_vectorize
[LLM]
– Postgres extension that automates the transformation and orchestration of text to embeddings for vector and semantic search

Looking for a specific tool? Explore 70+ Python tools →

Stay Current with CodeCut

Actionable Python tips, curated for busy data pros. Skim in under 2 minutes, three times a week.



Newsletter #221: handcalcs: Generate LaTeX Step-by-Step Calculations from Python

📅 Today’s Picks

handcalcs: Generate LaTeX Step-by-Step Calculations from Python

Problem
Showing the intermediate steps of a calculation helps stakeholders understand it and verify the results.
However, writing LaTeX by hand for each calculation step is slow and tedious.
Solution
handcalcs eliminates manual LaTeX writing by auto-generating mathematical documentation from your Python calculations.
Perfect for engineering reports, data science documentation, and educational materials.

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

nanoGPT [LLM] – The simplest, fastest repository for training and fine-tuning medium-sized GPTs: a clean, minimal implementation of GPT in PyTorch.

GHunt [Python Utils] – A modular OSINT tool, designed to evolve over the years, that incorporates many techniques for investigating Google accounts.

beartype [Python Utils] – Fast runtime type checking for Python: an open-source, pure-Python type checker that emphasizes efficiency and portability.




Newsletter #220: Altair: Multi-Chart Filtering in Pure Python

📅 Today’s Picks

LangChain: Smart Text Chunking Without Breaking Context

Problem
RAG (Retrieval-Augmented Generation) applications need to split documents into smaller chunks before embedding them.
However, naive fixed-size splitting can cut sentences and paragraphs in the middle, which makes the resulting embeddings less useful for retrieval.
Solution
LangChain’s RecursiveCharacterTextSplitter helps document chunks keep their meaning and context for better RAG performance.
It tries a list of separators in order, moving to a finer one only when a chunk is still too large, for example:

Double newlines (paragraphs)
Single newlines
Periods
Spaces
Individual characters (as last resort)

RecursiveCharacterTextSplitter also lets you tune the chunk size and overlap for your use case.
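
To make the separator fallback concrete, here is a minimal pure-Python sketch of the recursive idea. It is not LangChain's implementation: the function name recursive_split and the separator tuple are assumptions made for illustration, chunk overlap is left out, and separators at cut points are simply dropped. As far as I know, periods are not part of LangChain's default separator list, so with the real class you would likely pass a custom separators argument to get the order listed above.

```python
# Illustrative sketch only; names here are hypothetical, not LangChain's API.
# The separator list must end with "" so there is always a last resort.
SEPARATORS = ("\n\n", "\n", ". ", " ", "")


def recursive_split(text, chunk_size, separators=SEPARATORS):
    """Split text into chunks of at most chunk_size characters.

    The coarsest separator is tried first; a finer one is used only for
    pieces that are still too long after splitting.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []

    sep, finer = separators[0], separators[1:]
    if sep == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    chunks, current = [], ""
    for part in text.split(sep):
        # Greedily merge neighbouring parts while they still fit.
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
        if len(part) <= chunk_size:
            current = part
        else:
            # This part alone is too long: retry with finer separators.
            chunks.extend(recursive_split(part, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = (
    "First paragraph is short.\n\n"
    "Second paragraph is a bit longer. It has two sentences.\n\n"
    "Third."
)
chunks = recursive_split(sample, chunk_size=40)
for chunk in chunks:
    print(repr(chunk))
```

Paragraphs that fit stay whole, and only the oversized middle paragraph is broken at a sentence boundary, which is the behaviour the separator order is meant to produce.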

📖 View Full Article

🧪 Run code

⭐ View GitHub

Altair: Multi-Chart Filtering in Pure Python

Problem
Separate static charts cannot show how different views of the same data relate to each other.
Traditional dashboards often need a backend server to support interactive filtering.
Solution
Altair’s linked plots enable interactive selections that dynamically filter multiple connected visualizations.
Other features of Altair:

Declarative syntax that makes visualization intuitive
Built-in data transformations and aggregations
Seamless chart composition and layering

📖 View Full Article

🧪 Run code

⭐ View GitHub

☕️ Weekly Finds

Boruta-Shap [ML] – A tree-based feature selection tool that combines the Boruta algorithm with Shapley values for interpretable feature importance.

py-roughviz [Data Viz] – A Python visualization library for creating sketchy, hand-drawn-style charts that look more playful than standard matplotlib graphs.

prek [Python Utils] – A faster pre-commit re-engineered in Rust that automatically installs required Python versions and creates virtual environments.





Work with Khuyen Tran