Today’s Picks
ChromaDB: Metadata Filtering for Precise Semantic Search
Problem
Search for “latest ML research” and semantic search might return highly relevant papers from 2019.
That’s because similarity doesn’t understand constraints. You need metadata filtering to enforce “year >= 2024” at the database level.
Solution
ChromaDB’s where clause lets you combine “find similar” with “but only from 2024.” The database filters first, then ranks by similarity.
Key operators:
- $eq, $ne for exact matching
- $gt, $gte, $lt, $lte for range queries
- $in, $nin for set membership
- $and, $or for combining conditions
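As a minimal sketch of how these operators combine, the snippet below builds a where filter that pairs a range condition with a set-membership condition, then passes it to a collection's query method. The metadata keys year and topic, the helper names build_where and search_recent, and the default values are assumptions made for illustration; they are not taken from the original article.

```python
def build_where(min_year, topics):
    """Build a filter: year >= min_year AND topic is one of `topics`.

    The metadata keys "year" and "topic" are hypothetical examples.
    """
    return {
        "$and": [
            {"year": {"$gte": min_year}},       # range condition
            {"topic": {"$in": list(topics)}},   # set membership
        ]
    }


def search_recent(collection, query, min_year=2024, topics=("ml",), n_results=3):
    """Run a similarity query restricted to documents matching the filter.

    `collection` is expected to be a ChromaDB collection, for example one
    created with chromadb.Client().create_collection("papers").
    """
    return collection.query(
        query_texts=[query],
        n_results=n_results,
        where=build_where(min_year, topics),
    )
```

Calling search_recent(collection, "latest ML research") would then return only documents whose metadata satisfies both conditions, ranked by similarity to the query text.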
Worth Revisiting
Semantic Search in PostgreSQL with pgvector
Problem
Traditional PostgreSQL keyword queries match literal terms, so they return limited results. They miss semantically related data that shares meaning but uses different terminology.
Solution
pgvector enables vector search within PostgreSQL. This allows semantic matching of contextually similar content.
Key benefits:
- Native PostgreSQL integration with existing databases
- Fast exact and approximate nearest neighbor search
- Six distance metrics including L2, cosine, inner product, and Hamming
- Seamless Python integration via SQLAlchemy or psycopg2
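A nearest neighbor query with pgvector might look like the sketch below, which picks a distance operator and orders rows by distance to a query embedding. The table name documents, the columns id, content, and embedding, and the helper names are hypothetical; the connection is assumed to be a psycopg2 connection to a database where the vector extension is already enabled.

```python
# pgvector distance operators for three of the supported metrics.
OPERATORS = {
    "l2": "<->",             # Euclidean distance
    "cosine": "<=>",         # cosine distance
    "inner_product": "<#>",  # negative inner product
}


def to_vector_literal(embedding):
    """Format a sequence of numbers as a pgvector text literal, e.g. [1.0,2.5]."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


def build_knn_sql(metric="cosine"):
    """Return a parameterized k-nearest-neighbor query for the chosen metric.

    The table and column names are hypothetical examples.
    """
    op = OPERATORS[metric]
    return (
        "SELECT id, content FROM documents "
        f"ORDER BY embedding {op} %s::vector LIMIT %s"
    )


def search(conn, embedding, k=5, metric="cosine"):
    """Fetch the k rows closest to `embedding` using a psycopg2 connection."""
    with conn.cursor() as cur:
        cur.execute(build_knn_sql(metric), (to_vector_literal(embedding), k))
        return cur.fetchall()
```

Passing the embedding as a text literal cast to vector keeps the example free of extra adapters; the operator is chosen from a fixed dictionary rather than user input, so the query stays parameterized.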
Weekly Finds
RAGxplorer [LLM] – Open-source tool to visualize RAG embeddings and explore retrieval augmented generation pipelines interactively
CAMEL [LLM] – The first multi-agent framework enabling AI agents to communicate and collaborate while assuming different roles
claude-scientific-skills [LLM] – A set of ready-to-use scientific skills for Claude, enabling advanced research and analysis workflows
Looking for a specific tool? Explore 70+ Python tools
Latest Deep Dives
What’s New in pandas 3.0: Expressions, Copy-on-Write, and Faster Strings – Learn what’s new in pandas 3.0: pd.col expressions for cleaner code, Copy-on-Write for predictable behavior, and PyArrow-backed strings for 5-10x faster operations.


