Python Data Models: Pydantic or attrs?

When it comes to building data models in Python, two popular libraries are Pydantic and attrs. While both libraries provide a convenient way to define data models, they have different strengths and weaknesses.

Pydantic: Built-in Data Validation and Type Checking

Pydantic is a popular library that provides built-in data validation and type checking. This makes it an excellent choice for web APIs and external data handling. However, this added functionality comes at a cost:

  • Performance overhead
  • High memory usage
  • Harder to debug

Here’s an example of a Pydantic model:

from pydantic import BaseModel

class UserPydantic(BaseModel):
    name: str
    age: int
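
To make the validation and type checking concrete, here is a minimal sketch of how this model behaves with Pydantic's default settings. It redefines UserPydantic so the snippet is self-contained, and the input values are made up for illustration.

```python
from pydantic import BaseModel, ValidationError


class UserPydantic(BaseModel):
    name: str
    age: int


# By default Pydantic coerces compatible input: the string "30" becomes an int.
user = UserPydantic(name="Bob", age="30")
print(type(user.age).__name__)

# Input that cannot be converted to int is rejected with a ValidationError.
try:
    UserPydantic(name="Bob", age="thirty")
except ValidationError as e:
    print("Rejected:", len(e.errors()), "error(s)")
```

This coercion and checking happens on every instantiation, which is where the extra cost comes from.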

Attrs: Simpler and Faster

Attrs, on the other hand, does not validate data or enforce type hints by default, which results in faster performance and lower memory usage compared to Pydantic. This makes it a better choice for internal data structures and simpler class creation.

from attrs import define

@define
class UserAttrs:
    name: str
    age: int

Performance Comparison

Let’s compare the performance of Pydantic and attrs using a simple benchmark:

from timeit import timeit

# Test data
data = {"name": "Bob", "age": 30}

# Benchmark
pydantic_time = timeit(lambda: UserPydantic(**data), number=100000)
attrs_time = timeit(lambda: UserAttrs(**data), number=100000)

print(f"Pydantic: {pydantic_time:.4f} seconds")
print(f"attrs: {attrs_time:.4f} seconds")
print(f"Using attrs is {pydantic_time/attrs_time:.2f} times faster than using Pydantic")

Output:

Pydantic: 0.1071 seconds
attrs: 0.0155 seconds
Using attrs is 6.90 times faster than using Pydantic

In this run, creating an attrs instance was approximately 6.9 times faster than creating a Pydantic instance. The exact ratio will vary with library versions and hardware.

Adding Validation to Attrs

While attrs doesn’t run any validation by default, you can easily attach a validator to a field using a decorator:

from attrs import define, field

@define
class UserAttrs:
    name: str
    age: int = field()

    @age.validator
    def check_age(self, attribute, value):
        # attrs ignores a validator's return value; failure is signaled by raising
        if value < 0:
            raise ValueError("Age can't be negative")


try:
    user = UserAttrs(name="Bob", age=-1)
except ValueError as e:
    print("ValueError:", e)

Output:

ValueError: Age can't be negative

In this example, we’ve added a validator to the age field to ensure it’s not negative. If you try to create a UserAttrs instance with a negative age, it will raise a ValueError.

In conclusion, while Pydantic provides built-in data validation and type checking, attrs offers a simpler and faster way to define data models. By adding validation using decorators, you can still ensure data integrity while enjoying the performance benefits of attrs.

Link to attrs.
