Today’s Picks
Stream Large CSVs to Parquet with Polars sink_parquet
Problem
Traditional workflows load the full CSV into memory before writing, which crashes when the file is too large.
Solution
Polars sink_parquet() streams data directly from CSV to Parquet without loading the entire file into memory.
Instead of load-then-write, sink_parquet uses read-write-release:
- Reads a chunk from CSV
- Writes it to Parquet
- Releases memory before next chunk
- Repeats until complete
Codon: One Decorator to Turn Python into C Speed
Problem
Slow Python functions in large codebases are painful to optimize. You might try Numba, but it only works for numerical code with NumPy arrays.
You might try Cython, but it needs .pyx files, variable type annotations, and build setup. That’s hours of refactoring before you see any speedup.
Solution
Codon solves this with a single @codon.jit decorator that compiles your Python to machine code.
Key benefits:
- Works on general-purpose Python code, not just NumPy arrays
- No type annotations required since types are inferred automatically
- Compiled functions are cached for instant repeated calls
- Zero code changes beyond adding the decorator
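As a rough sketch of what adding the decorator looks like, the example below compiles a plain-Python loop. It assumes the `codon` module is available (typically via the `codon-jit` package alongside a Codon installation); the function `count_primes` is a hypothetical workload, and the fallback decorator is an addition so the snippet still runs as ordinary Python when Codon is missing.

```python
try:
    import codon  # assumption: provided by the codon-jit package

    jit = codon.jit
except ImportError:
    # Fallback (not part of Codon): leave the function as regular Python.
    def jit(fn):
        return fn


@jit
def count_primes(limit):
    """Count primes below `limit` using trial division (no NumPy involved)."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


print(count_primes(30))
```

The function body is unchanged Python with no type annotations; the decorator is the only Codon-specific line, so the speedup you actually get depends on your workload and setup.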
Weekly Finds
metabase [Data Viz] – Open-source Business Intelligence and Embedded Analytics tool that lets everyone work with data
Surprise [ML] – Python scikit for building and analyzing recommender systems with SVD, KNN, and more algorithms
highdimensional-decision-boundary-plot [Data Viz] – Scikit-learn compatible approach to plot high-dimensional decision boundaries for intuitive model understanding