| 📅 Today’s Picks |
Stream Large CSVs to Parquet with Polars sink_parquet
Problem:
Traditional workflows load the full CSV into memory before writing, which crashes when the file is too large.
Solution:
Polars sink_parquet() streams data directly from CSV to Parquet without loading the entire file into memory.
Instead of load-then-write, sink_parquet uses read-write-release:
- Reads a chunk from CSV
- Writes it to Parquet
- Releases memory before next chunk
- Repeats until complete
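The read-write-release loop above can be sketched in a few lines. This is a minimal illustration, not a tuned pipeline: the helper name stream_csv_to_parquet and the file paths are hypothetical, and it assumes a Polars version where pl.scan_csv and LazyFrame.sink_parquet are available.

```python
def stream_csv_to_parquet(csv_path: str, parquet_path: str) -> None:
    """Convert a CSV file to Parquet without materializing it in memory.

    Hypothetical helper for illustration; the paths are supplied by the caller.
    """
    # Imported here so this module still loads where Polars is not installed.
    import polars as pl

    # scan_csv builds a lazy query plan; no rows are read at this point.
    lazy_frame = pl.scan_csv(csv_path)

    # sink_parquet executes the plan and writes the result to disk,
    # rather than collecting a full DataFrame first.
    lazy_frame.sink_parquet(parquet_path)


if __name__ == "__main__":
    # Placeholder file names; replace with real paths.
    stream_csv_to_parquet("large_file.csv", "large_file.parquet")
```

The key design choice is starting from scan_csv rather than read_csv: sink_parquet is a method on lazy frames, so the eager read_csv call would load the whole file before anything is written.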
Codon: One Decorator to Turn Python into C Speed
Problem:
Slow Python functions in large codebases are painful to optimize. You might try Numba, but it is built mainly for numerical code that operates on NumPy arrays.
You might try Cython, but it needs .pyx files, variable type annotations, and build setup. That’s hours of refactoring before you see any speedup.
Solution:
Codon solves this with a single @codon.jit decorator that compiles your Python to machine code.
Key benefits:
- Works on general-purpose Python code within the subset Codon supports, not just NumPy arrays
- No type annotations required since types are inferred automatically
- Compiled functions are cached for instant repeated calls
- Zero code changes beyond adding the decorator
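As a rough sketch of what this looks like in practice, the example below decorates a plain-Python loop with no NumPy involved. The function count_primes is a made-up workload for illustration, and the try/except fallback is an assumption added here so the snippet still runs (uncompiled) on machines where the codon package is not installed.

```python
try:
    import codon
    jit = codon.jit
except ImportError:
    # Fallback for illustration only: without Codon installed,
    # the function runs as ordinary interpreted Python.
    def jit(func):
        return func


@jit
def count_primes(limit):
    """Count the primes in the range [2, limit) by trial division."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        # Only test divisors up to the square root of n.
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


if __name__ == "__main__":
    print(count_primes(100))  # 25 primes below 100
```

The function body is unchanged from what you would write in plain Python; the decorator is the only addition. Actual speedups depend on the workload and are not measured here.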
| ☕️ Weekly Finds |
metabase
Data Viz
Open-source Business Intelligence and Embedded Analytics tool that lets everyone work with data
Surprise
ML
Python scikit for building and analyzing recommender systems with SVD, KNN, and more algorithms
Scikit-learn compatible approach to plot high-dimensional decision boundaries for intuitive model understanding


