Motivation
In data workflows, SQL is often great for querying and manipulating data stored in databases, while Python is ideal for advanced analytical tasks and custom transformations. However, switching between these two languages can create inefficiencies, requiring data exports, imports, and redundant processing steps.

Introduction to Ibis
Ibis is a portable Python dataframe library that bridges the gap between SQL and Python, allowing users to write SQL queries and perform Pythonic manipulations seamlessly. It supports nearly 20 backends, including DuckDB, PostgreSQL, BigQuery, SQLite, and more, enabling consistent syntax across various data platforms.
Installation is simple. To get started with DuckDB and example data, run:
pip install 'ibis-framework[duckdb,examples]'
Ibis enables users to write SQL queries directly on dataframes and continue with Python operations, making it an ideal tool for combining the flexibility of Python with the power of SQL.
Integrating SQL and Python with Ibis
Ibis simplifies the integration of SQL and Python by allowing you to execute SQL queries on a dataframe and seamlessly apply Pythonic transformations. Here’s an example:
import ibis
# Load example data
penguins_data = ibis.examples.penguins.fetch()
# Perform an SQL query
species_by_island = penguins_data.sql("""
SELECT species, island, count(*) AS count
FROM penguins GROUP BY 1, 2
""")
# Continue with Python operations
sorted_res = species_by_island.order_by("count")
print(sorted_results)
Output:
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━┓
┃ species ┃ island ┃ count ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━┩
│ string │ string │ int64 │
├───────────┼───────────┼───────┤
│ Adelie │ Biscoe │ 44 │
│ Adelie │ Torgersen │ 52 │
│ Adelie │ Dream │ 56 │
│ Chinstrap │ Dream │ 68 │
│ Gentoo │ Biscoe │ 124 │
└───────────┴───────────┴───────┘
In the code above:
t.sql("SELECT species, island, count(*) AS count FROM penguins GROUP BY 1, 2")
: Executes an SQL query directly on the Ibis tablet
, grouping data by species and island, and calculating counts.order_by("count")
: Orders the result by thecount
column using Pythonic syntax.
Conclusion
By integrating SQL and Python seamlessly, Ibis empowers data scientists to focus on insights rather than tool integration. With support for various backends and consistent syntax, Ibis is a powerful tool for modern data workflows.