Newsletter #264: Codon: One Decorator to Turn Python into C Speed


📅 Today’s Picks

Stream Large CSVs to Parquet with Polars sink_parquet

Problem

Traditional workflows load the full CSV into memory before writing, which crashes when the file is too large.

Solution

Polars sink_parquet() streams data directly from CSV to Parquet without loading the entire file into memory.

Instead of load-then-write, sink_parquet uses read-write-release:

  • Reads a chunk from CSV
  • Writes it to Parquet
  • Releases memory before next chunk
  • Repeats until complete

Codon: One Decorator to Turn Python into C Speed

Problem

Slow Python functions in large codebases are painful to optimize. You might try Numba, but it is geared toward numerical code built around NumPy arrays.

You might try Cython, but it needs .pyx files, variable type annotations, and a build setup. That’s hours of refactoring before you see any speedup.

Solution

Codon solves this with a single @codon.jit decorator that compiles your Python to machine code.

Key benefits:

  • Works on any Python code, not just NumPy arrays
  • No type annotations required since types are inferred automatically
  • Compiled functions are cached for instant repeated calls
  • Zero code changes beyond adding the decorator
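
To illustrate the decorator, here is a minimal sketch rather than the newsletter's original snippet. The function count_primes_below is a hypothetical example, and the try/except fallback is an assumption added so the code still runs as plain Python when the codon package is not installed. It assumes a working Codon installation with its Python JIT package.

```python
try:
    import codon  # Python JIT interface; assumes Codon itself is installed

    jit = codon.jit
except ImportError:
    # Assumption for illustration: fall back to a no-op decorator so the
    # sketch runs as ordinary Python when Codon is unavailable.
    def jit(func):
        return func


@jit
def count_primes_below(limit):
    # Plain Python loops with no type annotations; with Codon available,
    # the decorator compiles this function to machine code.
    count = 0
    for candidate in range(2, limit):
        is_prime = True
        divisor = 2
        while divisor * divisor <= candidate:
            if candidate % divisor == 0:
                is_prime = False
                break
            divisor += 1
        if is_prime:
            count += 1
    return count


if __name__ == "__main__":
    print(count_primes_below(100_000))
```

Loop-heavy functions like this one are where a compiled version would typically help most. Any actual speedup depends on the workload and should be measured.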

☕️ Weekly Finds

metabase [Data Viz] – Open-source Business Intelligence and Embedded Analytics tool that lets everyone work with data

Surprise [ML] – Python scikit for building and analyzing recommender systems with SVD, KNN, and more algorithms

highdimensional-decision-boundary-plot [Data Viz] – Scikit-learn compatible approach to plot high-dimensional decision boundaries for intuitive model understanding

Looking for a specific tool? Explore 70+ Python tools →
