Newsletter #286: Write Readable Multi-Condition Logic with Polars when-then-otherwise

Grab your coffee. Here are this week’s highlights.


📅 Today’s Picks

Write Readable Multi-Condition Logic with Polars when-then-otherwise

Problem

In pandas, conditional columns are usually built with np.where(), which breaks method chaining even for a single condition and turns into nested, hard-to-read calls once there are several.

The apply() alternative is slow and also breaks the DataFrame workflow.

Solution

Polars provides when().then().otherwise() chains that integrate naturally with method chaining.

With pandas, nested np.where() calls stack up for each additional condition, creating deeply nested expressions. Polars replaces this with readable chains where each condition appears sequentially.

Key benefits:

  • Natural flow with method chaining
  • Each condition stands on its own line
  • No nested function calls
  • Maintains data transformation workflow

The pattern scales cleanly from two conditions to ten without sacrificing readability.


Extract Text from Any Document Format with Docling

Problem

Have you ever needed to pull text from PDFs, Word files, slide decks, or images for a project? Writing a different parser for each format is slow and error-prone.

Solution

Docling’s DocumentConverter takes care of that by detecting the file type and applying the right parsing method for PDF, DOCX, PPTX, HTML, and images.
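A minimal sketch of that workflow might look like the following. The helper name `convert_to_markdown` and the file name in the usage comment are hypothetical. The sketch assumes the docling package is installed and imports it inside the function, so the helper can be defined even where docling is not available.

```python
def convert_to_markdown(source: str) -> str:
    """Convert a document (local path or URL) to Markdown text using Docling.

    Hypothetical helper for illustration; assumes `pip install docling`.
    """
    # Imported lazily so that defining this helper does not require docling.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    # The converter detects the input format (PDF, DOCX, PPTX, HTML, image).
    result = converter.convert(source)
    return result.document.export_to_markdown()


# Example usage (the file name is a placeholder, not from the original):
# markdown_text = convert_to_markdown("report.pdf")
# print(markdown_text[:500])
```

The same call is intended to work regardless of input format, which is what removes the need for one parser per file type.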

Other features of Docling:

  • AI-powered image descriptions for searchable diagrams
  • Export to pandas DataFrames, JSON, or Markdown
  • Structure-preserving output optimized for RAG pipelines
  • Built-in chunking strategies for vector databases
  • Parallel processing handles large document batches efficiently

☕️ Weekly Finds

lm-evaluation-harness [Machine Learning] – Unified framework for testing and evaluating generative language models across a wide range of benchmarks and tasks with support for local models and custom metrics

PyMC [Probabilistic Programming] – Probabilistic programming library for Python that allows users to build Bayesian models with a simple Python API and fit them using state-of-the-art methods

Quarkdown [Documentation] – Modern Markdown typesetting system with powerful extensions for creating books, articles, and presentations. Supports function calls, custom functions, and outputs HTML, PDF, and slides

Looking for a specific tool? Explore 70+ Python tools →

