Newsletter #296: Scrapling: Adaptive Web Scraping in Python

Grab your coffee. Here are this week’s highlights.


📅 Today’s Picks

Scrapling: Adaptive Web Scraping in Python

Problem

Traditional scraping with BeautifulSoup uses hardcoded CSS selectors to find elements on a page.

If the site updates its layout, those selectors no longer match, and the scraper returns empty data, often without raising an error.
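
To make the failure mode concrete, here is a minimal sketch. It uses Python's standard-library `html.parser` as a stand-in for BeautifulSoup so it runs without extra dependencies. The HTML snippets, the class names, and the `select_by_class` helper are all made up for illustration.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element whose class list contains target_class.

    Toy helper: assumes well-formed HTML with no unclosed void tags like <br>.
    """

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # > 0 while we are inside a matching element
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1
        elif self.target_class in classes:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data.strip()


def select_by_class(html, class_name):
    """Hypothetical stand-in for a hardcoded class selector such as '.price'."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    return parser.results


# The same product before and after a (made-up) redesign.
old_html = '<div class="product"><span class="price">$19.99</span></div>'
new_html = '<div class="item-card"><span class="cost">$19.99</span></div>'

before = select_by_class(old_html, "price")  # selector matches
after = select_by_class(new_html, "price")   # selector silently matches nothing

print(before, after)
```

The second call does not fail; it simply returns an empty list, which is why this kind of breakage can go unnoticed in a scheduled scraper.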

Solution

Instead of relying only on selectors, Scrapling records how elements appear during the initial scrape.

If the site is redesigned later, it can use that stored structure to find the same elements again.
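
The idea can be illustrated with a toy sketch built only on the standard library. This is not Scrapling's API or its actual matching algorithm. The fingerprint fields (tag, ancestor path, text), the scoring function, and the sample HTML are assumptions chosen to show the general approach: save a description of the element on the first scrape, then fall back to the closest match when the selector stops working.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Flatten a page into records: tag, classes, ancestor path, direct text."""

    def __init__(self):
        super().__init__()
        self.stack = []      # currently open elements, outermost first
        self.elements = []

    def handle_starttag(self, tag, attrs):
        element = {
            "tag": tag,
            "classes": (dict(attrs).get("class") or "").split(),
            "path": "/".join(e["tag"] for e in self.stack),
            "text": "",
        }
        self.stack.append(element)
        self.elements.append(element)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack:
            self.stack[-1]["text"] += data.strip()


def parse(html):
    collector = ElementCollector()
    collector.feed(html)
    return collector.elements


def similarity(saved, candidate):
    """Average of tag match, ancestor-path similarity, and text similarity."""
    tag_score = 1.0 if saved["tag"] == candidate["tag"] else 0.0
    path_score = SequenceMatcher(None, saved["path"], candidate["path"]).ratio()
    text_score = SequenceMatcher(None, saved["text"], candidate["text"]).ratio()
    return (tag_score + path_score + text_score) / 3


def find(html, class_name, saved=None):
    """Try the class selector first; fall back to the saved fingerprint."""
    elements = parse(html)
    matches = [e for e in elements if class_name in e["classes"]]
    if matches:
        return matches[0]
    if saved is None:
        return None
    return max(elements, key=lambda e: similarity(saved, e))


# Made-up page before and after a redesign: new tags, new class names, new price.
old_html = ('<div class="product"><h2 class="name">Desk Lamp</h2>'
            '<span class="price">$19.99</span></div>')
new_html = ('<section class="item-card"><h2 class="title">Desk Lamp</h2>'
            '<span class="cost">$18.49</span></section>')

fingerprint = find(old_html, "price")                    # initial scrape: save it
broken = find(new_html, "price")                         # selector alone finds nothing
relocated = find(new_html, "price", saved=fingerprint)   # fingerprint fallback

print(broken, relocated["tag"], relocated["text"])
```

In this sketch the price `span` is recovered because it still shares its tag and most of its text shape with the saved element, even though every class name changed. A production tool would need a richer fingerprint and a minimum-score cutoff to avoid confident wrong matches.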


Ibis: One Python API for 25+ Database Backends

Problem

Many data workflows begin with pandas for quick experimentation, while production pipelines might run on databases like PostgreSQL or BigQuery.

Moving from prototype to production usually means rewriting the same transformation logic in SQL. That translation takes time and can easily introduce errors.

Solution

Ibis solves this by letting you define a transformation once in Python; it then compiles that definition into native SQL for whichever of its 25+ backends you target.
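
To show the "define once, compile per backend" idea in miniature, here is a toy sketch that uses only the standard library. It is not Ibis's API: the `TotalBy` class, the `QUOTES` table, and the sample `orders` data are hypothetical, and the only dialect difference modeled is identifier quoting. SQLite stands in for a production database so the generated SQL can actually be executed.

```python
import sqlite3
from dataclasses import dataclass

# Identifier quoting is one of many things that differ between backends.
# These entries are illustrative, not a complete dialect description.
QUOTES = {"postgres": '"', "bigquery": "`", "sqlite": '"'}


@dataclass(frozen=True)
class TotalBy:
    """Hypothetical expression: sum `value` per `key` for rows above `min_value`."""

    table: str
    key: str
    value: str
    min_value: int

    def to_sql(self, dialect):
        q = QUOTES[dialect]
        key, value, table = (f"{q}{name}{q}" for name in (self.key, self.value, self.table))
        # A real compiler would bind min_value as a parameter instead of
        # interpolating it; this is kept inline only to keep the sketch short.
        return (
            f"SELECT {key}, SUM({value}) AS total "
            f"FROM {table} WHERE {value} > {self.min_value} "
            f"GROUP BY {key} ORDER BY {key}"
        )


# Define the transformation once...
expr = TotalBy(table="orders", key="region", value="amount", min_value=10)

# ...and render it for different backends.
postgres_sql = expr.to_sql("postgres")
bigquery_sql = expr.to_sql("bigquery")

# Run the same definition against an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 25.0), ("east", 5.0), ("west", 40.0), ("west", 15.0)],
)
rows = conn.execute(expr.to_sql("sqlite")).fetchall()
conn.close()

print(postgres_sql)
print(bigquery_sql)
print(rows)
```

The transformation logic lives in one Python object, and only the rendering step knows about the backend. That separation is what removes the manual pandas-to-SQL rewrite described above.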


☕️ Weekly Finds

Kronos [Machine Learning] – A decoder-only foundation model pre-trained on K-line sequences for financial market forecasting

pixi [Environment Management] – Fast, cross-platform package manager built on the Conda ecosystem, written in Rust

MinerU [OCR/PDF Processing] – One-stop tool for converting PDFs, webpages, and e-books into machine-readable markdown and JSON
