Simplify CSV Data Management with DuckDB

The Traditional Way

Traditional database systems require you to define a table schema up front and then run a separate import step before you can work with CSV data. This can be tedious and time-consuming.

To demonstrate this, let’s create a CSV file called customers.csv.

import pandas as pd

# Create a sample dataframe
data = {
    "name": ["Alice", "Bob", "Charlie", "David", "Eve"],
    "age": [25, 32, 45, 19, 38],
    "city": ["New York", "London", "Paris", "Berlin", "Tokyo"],
}

df = pd.DataFrame(data)

# Save the dataframe as a CSV file
df.to_csv("customers.csv", index=False)

To load this CSV file into Postgres, you need to create a table first and then copy the data into it:

-- Create the table
CREATE TABLE customers (
    name VARCHAR(100),
    age INT,
    city VARCHAR(100)
);

-- Load data from CSV
COPY customers
FROM 'customers.csv'
DELIMITER ','
CSV HEADER;

The DuckDB Way

In contrast, DuckDB can query CSV files directly from disk, eliminating the need for explicit table creation and data loading.

import duckdb

duckdb.sql("SELECT * FROM 'customers.csv'")
┌─────────┬───────┬──────────┐
│  name   │  age  │   city   │
│ varchar │ int64 │ varchar  │
├─────────┼───────┼──────────┤
│ Alice   │    25 │ New York │
│ Bob     │    32 │ London   │
│ Charlie │    45 │ Paris    │
│ David   │    19 │ Berlin   │
│ Eve     │    38 │ Tokyo    │
└─────────┴───────┴──────────┘

By using DuckDB, you can simplify your data import process and focus on analyzing your data.
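
Because DuckDB queries the CSV file directly, you can also run aggregations in the same statement, with no import step in between. The snippet below is a minimal sketch that assumes the customers.csv file created above; it groups the rows by city and converts the result to a pandas DataFrame with .df():

import duckdb

# Aggregate directly over the CSV file -- no table creation or loading required
result = duckdb.sql("""
    SELECT city, AVG(age) AS avg_age
    FROM 'customers.csv'
    GROUP BY city
    ORDER BY city
""")

# Convert the result to a pandas DataFrame for further analysis
print(result.df())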

Installation

To use DuckDB, you can install it using pip:

pip install duckdb 

Link to DuckDB.
