Newsletter #278: LangExtract: LLM-Powered Entity Extraction with One Example

📅 Today’s Picks

Skip Freshly Released Packages Automatically with uv

Problem

Installing updated package versions is essential to benefit from new features and bug fixes.

However, freshly released versions can introduce bugs or incompatibilities before the community has time to catch them.

Solution

uv’s exclude-newer option lets you set a cooldown period to skip packages released within a specified timeframe.

To use it, add exclude-newer = "7 days" under the [tool.uv] table in pyproject.toml and adjust the duration as needed.
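To make the cooldown idea concrete, here is a minimal Python sketch of the filtering logic. It is not uv's implementation: the release history, the newest_allowed helper, and the choice to rank by upload time (rather than by version number) are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical release history for an imaginary package: (version, upload time).
RELEASES = [
    ("1.0.0", datetime(2025, 1, 1, tzinfo=timezone.utc)),
    ("1.1.0", datetime(2025, 3, 1, tzinfo=timezone.utc)),
    ("1.2.0", datetime(2025, 3, 9, tzinfo=timezone.utc)),
]


def newest_allowed(releases, now, cooldown):
    """Return the most recently uploaded version that is older than the cooldown.

    Hypothetical helper: a real resolver compares version numbers and
    dependency constraints, which this sketch ignores.
    """
    cutoff = now - cooldown
    eligible = [(uploaded, version) for version, uploaded in releases if uploaded <= cutoff]
    if not eligible:
        return None  # every release is still inside the cooldown window
    return max(eligible)[1]


now = datetime(2025, 3, 10, tzinfo=timezone.utc)
print(newest_allowed(RELEASES, now, timedelta(days=7)))
```

With a 7-day cooldown, the release uploaded one day before "now" is skipped and the previous one is chosen. Shrinking the cooldown to zero would let the newest release through.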


LangExtract: LLM-Powered Entity Extraction with One Example

Problem

Named entity recognition extracts entities like names, dates, and organizations from text.

But pre-trained NER models can fail on domain-specific text. A model that wasn’t trained on medical terms can mislabel them, so “Metformin 500mg” gets labeled as “LAW” instead of “medication”.

Fixing this means retraining with thousands of labeled examples.

Solution

LangExtract is Google’s LLM-powered extraction library that skips retraining entirely. You describe the task and supply as little as one worked example, and it adapts to your domain.

Plus, every extraction includes:

  • Exact character positions for source verification
  • Attribute grouping to link related entities
  • Interactive visualizations to review results
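The first point, exact character positions, is what makes an extraction checkable against the source. The sketch below illustrates the idea in plain Python. It does not use LangExtract's API: the Span class, the locate helper, and the sample sentence are hypothetical names and data chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    label: str
    text: str
    start: int  # inclusive character offset in the source
    end: int    # exclusive character offset in the source


def locate(source: str, label: str, extracted: str) -> Optional[Span]:
    """Find the first occurrence of an extracted string and record its offsets."""
    start = source.find(extracted)
    if start == -1:
        return None  # the extracted text does not appear in the source
    return Span(label, extracted, start, start + len(extracted))


source = "Patient was prescribed Metformin 500mg twice daily."
span = locate(source, "medication", "Metformin 500mg")

# Slicing the source with the stored offsets should reproduce the entity text.
print(span)
print(source[span.start:span.end] == span.text)
```

Storing offsets alongside each entity means a reviewer (or a test) can slice the original text and confirm the model did not invent the value.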

☕️ Weekly Finds

pypdf [Python Utils] – Pure-Python PDF library for splitting, merging, cropping, and transforming PDF files

buzz [ML] – Transcribe and translate audio offline using OpenAI’s Whisper on your personal computer

autogluon [ML] – AWS AutoML toolkit for automating machine learning tasks with strong predictive performance

Looking for a specific tool? Explore 70+ Python tools →
