
Python Tips

Choosing the Right Python Collection: Lists, Sets, and Dictionaries

Python offers three primary mutable collection types: lists, sets, and dictionaries. Each has unique characteristics suited for different use cases.

Lists

Ordered collections allowing duplicate elements

Use cases:

Storing sequences in a specific order

Keeping collections with potential duplicates

Algorithms requiring indexed element access

Example: Tracking visitor entries chronologically

visitors = ["Alice", "Bob", "Charlie", "David", "Alice", "Eve"]

# Get the 3rd visitor
third_visitor = visitors[2] # "Charlie"

# Add a new visitor
visitors.append("Frank")

# Find the first occurrence of Alice
first_alice_index = visitors.index("Alice") # 0

# Count how many times Alice visited
alice_visits = visitors.count("Alice") # 2

print(f"Total visitors today: {len(visitors)}")
print(f"Visitors in order: {', '.join(visitors)}")

Output:

Total visitors today: 7
Visitors in order: Alice, Bob, Charlie, David, Alice, Eve, Frank

Sets

Unordered collections of unique items

Use cases:

Removing duplicates

Membership testing

Set operations (union, intersection, difference)

Storing unique values without order importance

Example: Tracking unique daily visitors

unique_visitors = {"Alice", "Bob", "Charlie", "David", "Alice", "Eve"}

# Check if Frank has visited
has_frank_visited = "Frank" in unique_visitors # False

# Add a new unique visitor
unique_visitors.add("Frank")

# Try to add Alice again (won't change the set)
unique_visitors.add("Alice")

# Get the total number of unique visitors
total_unique_visitors = len(unique_visitors)

print(f"Total unique visitors: {total_unique_visitors}")
print(f"Unique visitors: {', '.join(unique_visitors)}")

# Set operations
yesterday_visitors = {"Alice", "Bob", "George", "Hannah"}
new_visitors_today = unique_visitors - yesterday_visitors
returning_visitors = unique_visitors.intersection(yesterday_visitors)

print(f"New visitors today: {', '.join(new_visitors_today)}")
print(f"Returning visitors: {', '.join(returning_visitors)}")

Output (set iteration order is arbitrary, so the names may print in a different order):

Total unique visitors: 6
Unique visitors: Alice, Charlie, Bob, David, Frank, Eve
New visitors today: David, Frank, Charlie, Eve
Returning visitors: Alice, Bob
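
Converting to a set removes duplicates but discards the order of first appearance. When that order matters, one option is `dict.fromkeys`, which relies on dictionaries preserving insertion order (Python 3.7+). A minimal sketch, assuming the same visitor log as in the list example:

```python
# Visitor log with repeats, reused from the list example
visitors = ["Alice", "Bob", "Charlie", "David", "Alice", "Eve"]

# A set removes duplicates but does not guarantee any order
unique_unordered = set(visitors)

# dict.fromkeys keeps the first occurrence of each name, in order
unique_ordered = list(dict.fromkeys(visitors))

print(len(unique_unordered))  # 5
print(unique_ordered)  # ['Alice', 'Bob', 'Charlie', 'David', 'Eve']
```

Use the set when only membership matters, and the `dict.fromkeys` form when the de-duplicated result is shown to a user or compared against an expected sequence.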

Dictionaries

Key-value pairs (keys must be unique and hashable); insertion order is preserved since Python 3.7

Use cases:

Storing associated data

Fast value lookups by key

Counting item occurrences

Example: Managing detailed visitor information

visitor_info = {
    "Alice": {"age": 28, "membership": "Gold", "visits": 3},
    "Bob": {"age": 35, "membership": "Silver", "visits": 1},
    "Charlie": {"age": 42, "membership": "Bronze", "visits": 2},
    "David": {"age": 31, "membership": None, "visits": 1},
}

# Get Alice's information
alice_info = visitor_info["Alice"]
print(f"Alice's membership: {alice_info['membership']}")

# Update Bob's visit count
visitor_info["Bob"]["visits"] += 1

# Add a new visitor
visitor_info["Eve"] = {"age": 39, "membership": "Gold", "visits": 1}

# Get all Gold members
gold_members = [name for name, info in visitor_info.items() if info["membership"] == "Gold"]
print(f"Gold members: {', '.join(gold_members)}")

# Calculate average age of visitors
total_age = sum(info["age"] for info in visitor_info.values())
average_age = total_age / len(visitor_info)
print(f"Average visitor age: {average_age:.1f}")

# Find the visitor with the most visits
most_frequent_visitor = max(visitor_info, key=lambda x: visitor_info[x]["visits"])
print(f"Most frequent visitor: {most_frequent_visitor}")

Output:

Alice's membership: Gold
Gold members: Alice, Eve
Average visitor age: 35.0
Most frequent visitor: Alice
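
The "counting item occurrences" use case is not covered by the example above, so here is a minimal sketch of it. It assumes the same visitor log as in the list example and compares a plain dictionary with `collections.Counter`, a standard-library dict subclass built for counting:

```python
from collections import Counter

# Visitor log with repeats, reused from the list example
visitors = ["Alice", "Bob", "Charlie", "David", "Alice", "Eve"]

# Manual counting with a plain dictionary
visit_counts = {}
for name in visitors:
    # get() returns 0 the first time a name is seen
    visit_counts[name] = visit_counts.get(name, 0) + 1

# Counter does the same bookkeeping in one call
counter = Counter(visitors)

print(visit_counts["Alice"])  # 2
print(counter.most_common(1))  # [('Alice', 2)]
```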

These examples demonstrate how each data structure can be used effectively in a visitor tracking system:

Lists are great for maintaining the order of visitors and allowing duplicate entries.

Sets are perfect for quickly checking unique visitors and performing set operations.

Dictionaries are ideal for storing and retrieving complex visitor information, with fast lookups based on visitor names.


Structural Pattern Matching in Python 3.10

Extracting data from nested structures often leads to complex, error-prone code with multiple checks and conditionals. Consider this traditional approach:

def get_youngest_pet(pet_info):
    if isinstance(pet_info, list) and len(pet_info) == 2:
        if all("age" in pet for pet in pet_info):
            print("Age is extracted from a list")
            return min(pet_info[0]["age"], pet_info[1]["age"])
    elif isinstance(pet_info, dict) and "age" in pet_info:
        if isinstance(pet_info["age"], dict):
            print("Age is extracted from a dict")
            ages = pet_info["age"].values()
            return min(ages)
    raise ValueError("Invalid input format")

# Usage
pet_info1 = [{"name": "bim", "age": 1}, {"name": "pepper", "age": 9}]
print(get_youngest_pet(pet_info1)) # Output: 1

pet_info2 = {'age': {"bim": 1, "pepper": 9}}
print(get_youngest_pet(pet_info2)) # Output: 1

Python 3.10’s pattern matching provides a more declarative and readable way to handle complex data structures, reducing the need for nested conditionals and type checks.

def get_youngest_pet(pet_info):
    match pet_info:
        case [{"age": age1}, {"age": age2}]:
            print("Age is extracted from a list")
            return min(age1, age2)
        case {'age': ages} if isinstance(ages, dict):
            print("Age is extracted from a dict")
            return min(ages.values())
        case _:
            raise ValueError("Invalid input format")

# Usage remains the same
pet_info1 = [{"name": "bim", "age": 1}, {"name": "pepper", "age": 9}]
print(get_youngest_pet(pet_info1)) # Output: 1

pet_info2 = {'age': {"bim": 1, "pepper": 9}}
print(get_youngest_pet(pet_info2)) # Output: 1


Python 3.9 and 3.10: Simplifying Code with Improved Typing Syntax

In Python 3.9 and later, built-in collections such as list, dict, set, and tuple can be subscripted directly in type hints (for example, list[int] instead of typing.List[int]), so the generic aliases no longer need to be imported from the typing module.

In Python 3.10 and later, unions can be written with the | operator (for example, int | None instead of Optional[int] or Union[int, None]), which shortens annotations further.
