Top 6 Python Libraries for Visualization: Which One to Use?

Motivation

If you’re new to Python visualization, the vast number of libraries and examples available might seem overwhelming. Some popular libraries for visualization include Matplotlib, seaborn, Plotly, Bokeh, Altair, and Pygal.

When visualizing a DataFrame, choosing the right library can be challenging as different libraries excel in specific cases.

This article will show the pros and cons of each library. By the end, you will gain a better understanding of their distinct features, making it easier for you to select the optimal library.

💻 Get the Code: The complete source code and Jupyter notebook for this tutorial are available on GitHub. Clone it to follow along!

Key Takeaways

Here’s what you’ll learn:

  • Master the strengths and limitations of 6 essential Python visualization libraries
  • Choose the optimal library based on your project requirements and complexity
  • Create interactive dashboards with Plotly and Bokeh for business intelligence
  • Leverage seaborn’s statistical plots to analyze data relationships with minimal code
  • Deploy lightweight SVG visualizations using Pygal for responsive web applications

Quick Reference

Before diving into detailed examples, here’s a comprehensive comparison to help you choose the right visualization library for your project:

| Feature | Matplotlib | seaborn | Pygal | Plotly | Altair | Bokeh |
|---|---|---|---|---|---|---|
| Code Complexity | High | Low | Low | Medium | Medium | Medium-High |
| Interactivity | None (static) | None (static) | Basic hover | Advanced | Grammar-based | Advanced |
| Chart Types | Extensive (50+) | Common plots | Basic (14 types) | Extensive (50+) | Statistical focus | Extensive |
| Web Integration | Poor | Poor | Good (SVG) | Excellent | Good | Excellent |
| Customization | High | Limited | Medium | High | Medium | High |
| Dependencies | Moderate | Moderate | Minimal | Heavy | Moderate | Moderate |

Quick Decision Guide

Here is a quick decision guide to help you choose the right visualization library for your project:

Choose Matplotlib when: you need publication-quality static plots, want full control over every element, or are preparing figures for academic papers or research

Choose seaborn when: you want statistical visualizations quickly, want attractive plots with minimal code, or work mainly with pandas DataFrames

Choose Pygal when: you are building lightweight web applications, need SVG vector graphics that scale cleanly, or want minimal dependencies and fast loading

Choose Plotly when: You need interactive visualizations with hover tooltips, zooming, and clickable legends, or when building web dashboards and applications that require user engagement with the data

Choose Altair when: you are exploring data statistically, prefer a grammar-of-graphics approach, need linked or coordinated views, or work primarily in Jupyter notebooks

Choose Bokeh when: you are building complex interactive web applications, need linked plots and advanced interactions, want fine-grained control over web deployment, or are creating custom visualization tools

Matplotlib

Matplotlib is probably the most common Python library for visualizing data. Almost everyone interested in data science has likely utilized Matplotlib at least once.

Pros

Easy to interpret data properties

When analyzing data, it’s often helpful to get a quick overview of its distribution.

For example, if you want to examine the distribution of the top 100 users with the most followers, Matplotlib is typically sufficient.

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

new_profile = pd.read_csv(
    "https://gist.githubusercontent.com/khuyentran1401/98658198f0ef0cb12abb34b4f2361fd8/raw/ece16eb32e1b41f5f20c894fb72a4c198e86a5ea/github_users.csv"
)

top_followers = new_profile.sort_values(by="followers", axis=0, ascending=False)[:100]

fig = plt.figure()

plt.bar(top_followers.user_name, top_followers.followers)
plt.show()

matplotlib bar plot

Although the 100 user names overlap on the x-axis, the graph still gives a clear picture of how followers are distributed.
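One common way to tame a crowded categorical axis is to plot fewer categories as horizontal bars. The sketch below assumes a small made-up DataFrame (`demo_profile`, with placeholder names and counts) rather than the GitHub data, so it runs without a download:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the GitHub data: same column names, made-up values
demo_profile = pd.DataFrame(
    {
        "user_name": [f"user_{i}" for i in range(30)],
        "followers": [1000 - 25 * i for i in range(30)],
    }
)

# Keep only the 10 largest values so every label has room
top_10 = demo_profile.nlargest(10, "followers")

fig, ax = plt.subplots(figsize=(6, 4))
# Horizontal bars put the user names on the y-axis, where they do not overlap
ax.barh(top_10["user_name"], top_10["followers"])
ax.invert_yaxis()  # largest value at the top
ax.set_xlabel("Followers")
fig.tight_layout()
plt.show()
```

Rotating the tick labels is another option, but with 100 categories the labels would still collide, so reducing the number of bars is usually the simpler fix.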

Versatility

Matplotlib is very versatile and capable of generating a wide range of graph types. The Matplotlib website offers comprehensive documentation and a gallery of various graphs, making it easy to find tutorials for virtually any type of plot.

fig = plt.figure()

plt.text(
    0.6,
    0.7,
    "learning",
    size=40,
    rotation=20.0,
    ha="center",
    va="center",
    bbox=dict(
        boxstyle="round",
        ec=(1.0, 0.5, 0.5),
        fc=(1.0, 0.8, 0.8),
    ),
)

plt.text(
    0.55,
    0.6,
    "machine",
    size=40,
    rotation=-25.0,
    ha="right",
    va="top",
    bbox=dict(
        boxstyle="square",
        ec=(1.0, 0.5, 0.5),
        fc=(1.0, 0.8, 0.8),
    ),
)

plt.show()

matplotlib text plot

Animation capabilities

Matplotlib provides powerful animation features through the matplotlib.animation module, enabling dynamic visualizations that evolve over time. Here are two examples that showcase the animation capabilities:

Animated Line Plot with Real-time Data

import matplotlib.animation as animation
import numpy as np
from IPython.display import Image

fig, ax = plt.subplots()

x = np.arange(0, 2 * np.pi, 0.01)
(line,) = ax.plot(x, np.sin(x))
ax.set_ylim(-1.5, 1.5)
ax.set_title("Animated Sine Wave")


def animate(frame):
    line.set_ydata(np.sin(x + frame / 10.0))
    return (line,)


ani = animation.FuncAnimation(fig, animate, frames=10, interval=50, blit=True)
ani.save("sine_wave_animation.gif", writer="pillow", fps=10)

Image("sine_wave_animation.gif")

matplotlib animated sine wave

Animated Bar Chart Race

# Create sample data for animation
categories = ["Product A", "Product B", "Product C", "Product D"]
fig, ax = plt.subplots()


def animate_bars(frame):
    ax.clear()
    # Simulate changing data over time
    values = [np.sin(frame / 10 + i) * 50 + 60 for i in range(4)]
    colors = plt.cm.viridis(np.linspace(0, 1, 4))

    bars = ax.bar(categories, values, color=colors)
    ax.set_ylim(0, 120)
    ax.set_title(f"Sales Performance - Month {frame + 1}")

    # Add value labels on bars
    for bar, value in zip(bars, values):
        height = bar.get_height()
        ax.text(
            bar.get_x() + bar.get_width() / 2.0,
            height + 2,
            f"{value:.0f}",
            ha="center",
            va="bottom",
        )

    return bars


ani = animation.FuncAnimation(fig, animate_bars, frames=50, interval=100)
ani.save("bar_race_animation.gif", writer="pillow", fps=5)

Image("bar_race_animation.gif")

matplotlib animated bar chart race

These animations demonstrate Matplotlib’s versatility for creating engaging dynamic visualizations, from scientific data trends to business dashboards.
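Saving a GIF is one way to share an animation. Inside a notebook, an animation can also be turned into an HTML/JavaScript player with the `to_jshtml()` method. The sketch below uses a shorter sine-wave animation with made-up parameters (`shift_wave` and the frame count are illustrative only):

```python
import matplotlib.animation as animation
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
x = np.arange(0, 2 * np.pi, 0.1)
(line,) = ax.plot(x, np.sin(x))


def shift_wave(frame):
    # Shift the phase a little on every frame
    line.set_ydata(np.sin(x + frame / 5.0))
    return (line,)


ani = animation.FuncAnimation(fig, shift_wave, frames=5, interval=100)

# to_jshtml() returns the player (frames plus playback controls) as a string
player_html = ani.to_jshtml()
plt.close(fig)  # avoid showing an extra static copy of the figure

# In a notebook cell, display it with:
#   from IPython.display import HTML
#   HTML(player_html)
```

Compared with a GIF, the player gives viewers pause and step controls, at the cost of a larger notebook file because every frame is embedded.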

Publication-quality output

Matplotlib excels at creating high-resolution, publication-ready visualizations suitable for academic papers, research reports, and professional presentations. The library provides precise control over figure size, DPI, and output formats (PNG, PDF, SVG, EPS), ensuring your plots meet the strict requirements of scientific journals and publications.

# Set publication-quality parameters
plt.rcParams.update({
    'font.size': 10,           # Standard academic paper font size
    'font.family': 'serif',    # Traditional serif fonts for publications
    'axes.linewidth': 1.2,     # Thicker axes for better print visibility
    'figure.dpi': 300,         # High resolution for sharp display
    'savefig.dpi': 300,        # High resolution for saved files
    'savefig.bbox': 'tight'    # Remove extra whitespace when saving
})

fig, (ax1, ax2) = plt.subplots(1, 2)
fig.suptitle('GitHub User Analysis: Publication Ready', fontsize=14, fontweight='bold')

# Subplot 1: Distribution histogram
top_users = new_profile.sort_values('followers', ascending=False)[:50]
ax1.hist(top_users['followers'], bins=15, alpha=0.7, color='steelblue', edgecolor='black')
ax1.set_xlabel('Followers Count')
ax1.set_ylabel('Frequency')
ax1.set_title('A) Follower Distribution')
ax1.grid(True, alpha=0.3)

# Subplot 2: Correlation scatter
ax2.scatter(top_users['followers'], top_users['total_stars'], alpha=0.6, s=30)
ax2.set_xlabel('Followers')
ax2.set_ylabel('Total Stars')
ax2.set_title('B) Followers vs Stars Correlation')
ax2.grid(True, alpha=0.3)

# Save as publication-ready formats
plt.tight_layout()
plt.savefig('github_analysis.pdf', format='pdf', bbox_inches='tight')
plt.savefig('github_analysis.eps', format='eps', bbox_inches='tight')
plt.show()

matplotlib publication-ready plots

Cons

Steep learning curve

Matplotlib’s extensive functionality comes with complexity. New users often find the syntax overwhelming, especially when transitioning from point-and-click visualization tools. Understanding the figure-axes hierarchy and object-oriented vs. pyplot interfaces requires significant time investment.
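To make the two interfaces concrete, here is a minimal sketch that draws with each of them. The data and titles are arbitrary and chosen only for illustration:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# pyplot interface: each call acts on the "current" figure and axes
plt.figure()
plt.plot(x, np.sin(x))
plt.title("pyplot interface")
implicit_ax = plt.gca()  # the Axes that pyplot created behind the scenes

# Object-oriented interface: hold explicit Figure and Axes objects
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))
ax_left.plot(x, np.sin(x))
ax_left.set_title("sin(x)")
ax_right.plot(x, np.cos(x))
ax_right.set_title("cos(x)")
fig.tight_layout()
plt.show()
```

The pyplot calls are shorter for a single plot, while the explicit `ax_left` and `ax_right` objects make it unambiguous which subplot each call modifies. Mixing the two styles in one script is a frequent source of confusion for newcomers.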

Extensive styling needed for publication-ready common plots

While Matplotlib supports virtually any chart type, producing visually polished versions of standard plots like histograms, scatter plots, or bar charts requires substantial customization work.

To make common visualizations suitable for presentations or sharing, you must manually style numerous elements: axis formatting, color schemes, legends, annotations, and layout spacing. Matplotlib’s low-level interface provides complete control but assumes users will configure all visual aspects from scratch.

For example, by default, the heatmap doesn’t have the x-axis and y-axis labels and annotations.

num_features = new_profile.select_dtypes("int64")
correlation = num_features.corr()

fig, ax = plt.subplots()
im = plt.imshow(correlation, cmap="coolwarm")

matplotlib heatmap

Creating a readable heatmap in Matplotlib requires several manual steps:

  • Define the visualization layout and color scheme
  • Manually position axis ticks and labels
  • Loop through each cell to add number annotations with appropriate formatting

num_features = new_profile.select_dtypes("int64")
correlation = num_features.corr()

fig, ax = plt.subplots()
im = ax.imshow(correlation, cmap="coolwarm")


ax.set_xticks(np.arange(len(correlation.columns)))
ax.set_yticks(np.arange(len(correlation.columns)))
ax.set_xticklabels(correlation.columns)
ax.set_yticklabels(correlation.columns)

# Add number annotations manually
for i in range(len(correlation.columns)):
    for j in range(len(correlation.columns)):
        # Use black text for every annotation
        ax.text(
            j,
            i,
            f"{correlation.iloc[i, j]:.2f}",
            ha="center",
            va="center",
            color="black",
        )

plt.setp(ax.get_xticklabels(), rotation=45, ha="right", rotation_mode="anchor")
plt.tight_layout()
plt.show()

matplotlib annotated heatmap

Creating a basic annotated heatmap shouldn’t require this many lines of manual setup and configuration.

Manual statistical processing required

Unlike higher-level libraries like seaborn, Matplotlib doesn’t automatically handle statistical computations or data preprocessing. You must manually perform all data grouping, filtering, aggregations, and statistical calculations before plotting.

from scipy import stats
from seaborn import load_dataset

penguins = load_dataset("penguins")

# Matplotlib requires manual statistical processing for regression analysis
# Filter out missing values manually
penguins_clean = penguins.dropna(subset=["bill_length_mm", "flipper_length_mm"])

# Manual regression calculation (no automatic trend estimation)
x = penguins_clean["bill_length_mm"].values
y = penguins_clean["flipper_length_mm"].values

# Compute regression statistics manually
slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

# Manual confidence interval calculation with proper statistical formula
n = len(x)
x_mean = np.mean(x)
sxx = np.sum((x - x_mean) ** 2)
y_pred = slope * x + intercept
residuals = y - y_pred
mse = np.sum(residuals**2) / (n - 2)
std_error_regression = np.sqrt(mse)

# Create confidence bands with proper varying width
x_smooth = np.linspace(x.min(), x.max(), 100)
y_smooth = slope * x_smooth + intercept

# Calculate standard error for each point on the smooth line
se_y = std_error_regression * np.sqrt(1 / n + (x_smooth - x_mean) ** 2 / sxx)

# Use t-distribution for 95% confidence interval
t_val = stats.t.ppf(0.975, df=n - 2)  # 95% confidence
ci = t_val * se_y

# Plot with manually computed statistics
plt.figure()
plt.scatter(x, y, alpha=0.6, s=20, color="#1f77b4", label="Data points")
plt.plot(x_smooth, y_smooth, color="#1f77b4", linewidth=2, label="Regression line")
plt.fill_between(
    x_smooth,
    y_smooth - ci,
    y_smooth + ci,
    alpha=0.2,
    color="#1f77b4",
    label="95% Confidence interval",
)
plt.xlabel("bill_length_mm")
plt.ylabel("flipper_length_mm")
plt.title("Penguins Regression (Manual statistical processing required)")
plt.legend()
plt.tight_layout()
plt.show()

matplotlib penguins regression

Key Takeaways

Matplotlib is capable of producing any plot, but creating complex plots often requires more code compared to other libraries.

seaborn

seaborn is a Python data visualization library built on top of Matplotlib. It offers a higher-level interface, simplifying the process of creating visually appealing plots.

Pros

Reduced code

seaborn offers a higher-level interface that simplifies plot creation compared to Matplotlib. Since it’s designed specifically for pandas DataFrames, you can create attractive visualizations with minimal code.

For instance, using the same data as before, we can create a nice heatmap without explicitly setting the x and y labels:

import seaborn as sns

# Load penguins dataset for seaborn examples
penguins = sns.load_dataset("penguins")

# Use numeric features only for correlation
correlation = num_features.corr()

sns.heatmap(correlation, annot=True)

seaborn heatmap

This results in a more visually appealing heatmap without the need for additional configuration.

Statistical plots with automatic processing

seaborn excels at automatically performing statistical computations and aggregations, eliminating the need for manual data processing. It handles statistical estimation, uncertainty visualization, and data transformations behind the scenes.

Automatic statistical estimation with confidence intervals:

Unlike the extensive manual statistical processing required in Matplotlib (as shown above), seaborn can create a scatter plot with trend line and confidence interval using just one function call:

# Fit and draw a regression line with a 95% confidence band in one call
sns.lmplot(
    data=penguins,
    x="bill_length_mm",
    y="flipper_length_mm",
)

seaborn lmplot

Other examples of seaborn’s statistical plots:

Automatic distribution fitting and visualization:

# Automatically computes and overlays histogram, KDE, and rug plots
sns.displot(data=penguins, x="flipper_length_mm", kde=True, rug=True)

seaborn displot

Automatic marginal distributions and regression fit:

# Automatically draws the scatter plot, fitted regression line, and marginal distributions
sns.jointplot(data=penguins, x="flipper_length_mm", y="body_mass_g", kind="reg")

seaborn jointplot

Automatic pairwise relationships:

# Automatically creates scatter plots for all numeric pairs, with per-species distributions on the diagonal
sns.pairplot(data=penguins, hue="species", palette="deep")

seaborn pairplot

Distribution visualization with kernel density:

# Automatically fits KDE and shows distribution shape
sns.violinplot(data=penguins, x="species", y="body_mass_g", palette="coolwarm")

seaborn violinplot for penguins

Statistical summary with quartiles:

# Automatically computes median, quartiles, and outliers
sns.boxplot(data=penguins, x="species", y="body_mass_g", palette="coolwarm")

seaborn boxplot

These examples demonstrate how seaborn handles complex statistical processing automatically, saving significant manual computation and code complexity.
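To see what "automatic" means in practice, the sketch below compares the bar heights that seaborn computes with the same aggregation done by hand in pandas. It uses made-up measurements (`demo`) instead of the penguins data so that it runs offline:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up measurements for three groups with clearly different means
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "group": np.repeat(["a", "b", "c"], 20),
        "value": np.concatenate(
            [rng.normal(10, 1, 20), rng.normal(12, 1, 20), rng.normal(15, 1, 20)]
        ),
    }
)

# seaborn: one call groups by "group", averages "value", and adds error bars
ax = sns.barplot(data=demo, x="group", y="value")

# The same aggregation written out by hand with pandas
manual_means = demo.groupby("group")["value"].mean()

# The bar heights should match the manually computed group means
bar_heights = sorted(patch.get_height() for patch in ax.patches)
heights_match = np.allclose(bar_heights, sorted(manual_means))
plt.show()
```

With Matplotlib alone, the `groupby` step and the error bars would both have to be written by hand before anything could be plotted.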

Cons

Limited customization compared to Matplotlib

seaborn excels at creating attractive plots quickly but sacrifices fine-grained control for simplicity. Advanced customizations such as precise positioning, custom annotations, or non-standard modifications require dropping down to Matplotlib's lower-level API, which undercuts the simplicity that makes seaborn attractive.

In the following example, seaborn draws a clean line plot in one call, but adding a highlighted period and a custom annotation requires Matplotlib's lower-level interface.

# Load flights dataset: monthly airline passenger counts, 1949-1960
flights = sns.load_dataset("flights")

# seaborn averages the monthly values per year and draws a confidence band
sns.lineplot(data=flights, x="year", y="passengers", label="Passenger Trend")

# But adding business annotations requires matplotlib
ax = plt.gca()

# Highlight jet age introduction period
ax.axvspan(1955, 1958, alpha=0.2, color='green', label='Jet Age')

# Add annotation for technical achievement
ax.annotate('Jet Engine Impact',
           xy=(1957, 400),
           xytext=(1952, 450),
           arrowprops=dict(arrowstyle='->', color='red', lw=2),
           fontsize=12, color='red', weight='bold')

# Build the legend from the labels set on the line and the shaded span
ax.legend(loc='upper left')
plt.title('Air Travel Growth with Historical Context')
plt.show()

seaborn line plot with annotations

Limited plot type collection

While seaborn excels at statistical plots, it lacks the breadth of Matplotlib’s visualization options for specialized scientific or custom chart types.

No interactive or animated features

seaborn is designed exclusively for static statistical visualizations and lacks any built-in support for interactivity or animations.

Key Takeaways

seaborn is a higher-level interface to Matplotlib. Although its collection of plot types is smaller than Matplotlib's, seaborn makes popular plots such as bar plots, box plots, and heatmaps look polished with less code.

Pygal

Pygal is a lightweight Python library that generates scalable vector graphics (SVG) charts. Built specifically for web applications, Pygal creates interactive visualizations with minimal dependencies and extremely fast rendering.

Pros

SVG vector graphics with perfect scaling

Pygal generates pure SVG output that scales perfectly across all devices and screen sizes. Unlike raster formats such as PNG, SVG charts stay crisp at any zoom level.

This makes Pygal ideal for responsive web applications where charts need to look sharp on both mobile devices and large desktop monitors:

import pygal

# Create a bar chart showing top GitHub users by followers
top_followers = new_profile.sort_values(by="followers", ascending=False)[:10]

bar_chart = pygal.Bar(
    title='Top 10 GitHub Users by Followers',
    x_title='Users',
    y_title='Followers'
)
bar_chart.x_labels = top_followers['user_name'].tolist()
bar_chart.add('Followers', top_followers['followers'].tolist())

# Save chart as SVG file
bar_chart.render_to_file('github_top_users.svg')

pygal bar chart

The resulting SVG can be embedded directly into HTML without additional dependencies or image files.

Built-in interactivity with hover tooltips

Every Pygal chart includes hover tooltips by default through native SVG interactivity. Users can explore data points without requiring additional JavaScript libraries, frameworks, or configuration.

The bar chart above automatically displays precise follower counts when you hover over each bar, providing instant data exploration without any setup required.

To see the chart in a browser, we can use the following code:

bar_chart.render_in_browser()

pygal bar chart in browser

To see Pygal’s interactivity in Jupyter notebooks, include the following JavaScript dependencies:

# For interactive display in notebooks, we need to wrap the SVG with JavaScript
from IPython.display import display, HTML

def display_pygal_chart(chart):
    """Display a Pygal chart with full interactivity in Jupyter notebooks"""
    html_template = """
    <!DOCTYPE html>
    <html>
      <head>
        <script type="text/javascript" src="http://kozea.github.com/pygal.js/javascripts/svg.jquery.js"></script>
        <script type="text/javascript" src="https://kozea.github.io/pygal.js/2.0.x/pygal-tooltips.min.js"></script>
      </head>
      <body>
        <figure>{chart}</figure>
      </body>
    </html>
    """
    rendered = chart.render(is_unicode=True)
    display(HTML(html_template.format(chart=rendered)))

# Display the chart with full interactivity (hover tooltips, etc.)
display_pygal_chart(bar_chart)

Professional styling for common chart types

One of Pygal’s key advantages is how it automatically enhances common chart types with professional aesthetics and built-in interactivity. Here are some examples:

Radar Chart – Multi-Dimensional User Comparison

Compare multiple users across different metrics simultaneously:

# Create radar chart for multi-dimensional comparison
top_5_users = new_profile.sort_values(by="followers", ascending=False)[:5]

radar_chart = pygal.Radar(title="Top 5 Users: Multi-Metric Comparison", fill=True)

# Normalize metrics to 0-100 scale for better comparison
max_followers = top_5_users["followers"].max()
max_stars = top_5_users["total_stars"].max()
max_forks = top_5_users["forks"].max()
max_contrib = top_5_users["contribution"].max()

for _, user in top_5_users.iterrows():
    radar_chart.add(
        user["user_name"],
        [
            (user["followers"] / max_followers) * 100,
            (user["total_stars"] / max_stars) * 100,
            (user["forks"] / max_forks) * 100,
            (user["contribution"] / max_contrib) * 100,
        ],
    )

radar_chart.x_labels = ["Followers", "Stars", "Forks", "Contributions"]
display_pygal_chart(radar_chart)

pygal radar chart

Box Plot – Age Distribution by Passenger Class

Compare age distributions across different Titanic passenger classes:

# Load titanic dataset
titanic = sns.load_dataset("titanic")

# Filter out passengers with missing age data
titanic_with_age = titanic.dropna(subset=["age"])

# Create box plot comparing age across passenger classes
box_plot = pygal.Box(
    title="Titanic: Age Distribution by Passenger Class",
    y_title="Age (years)",
    box_mode="tukey",  # Shows outliers beyond 1.5 IQR
)

# Get passenger classes and add data for each
classes = ["First", "Second", "Third"]
for class_name in classes:
    class_passengers = titanic_with_age[titanic_with_age["class"] == class_name]
    age_data = class_passengers["age"].tolist()  # Pygal expects lists

    # Add class with passenger count for context
    passenger_count = len(class_passengers)
    box_plot.add(f"{class_name} Class (n={passenger_count})", age_data)

display_pygal_chart(box_plot)

pygal box plot

Minimal dependencies and fast loading

Pygal requires minimal dependencies compared to other visualization libraries, making it ideal for lightweight applications where deployment size and startup speed matter.

Cons

Limited to 14 basic chart types

Pygal’s biggest limitation is its restricted chart type collection. With only 14 basic options (bar, line, pie, scatter, etc.), it lacks advanced statistical visualizations like violin plots, heatmaps, or complex multi-axis charts that are essential for sophisticated data analysis and scientific visualization.

Key Takeaways

Pygal excels when you need lightweight, scalable charts for web applications. Its SVG output, built-in interactivity, and minimal dependencies make it perfect for responsive dashboards, but consider other libraries for complex statistical analysis or specialized chart types.

Plotly

Plotly’s Python graphing library provides an effortless way to create interactive and high-quality graphs. It offers a range of chart types similar to Matplotlib and seaborn, including line plots, scatter plots, area charts, bar charts, and more.

Pros

Easily create beautiful interactive plots

Plotly excels at creating interactive visualizations with minimal code. Plotly Express makes this especially easy, allowing you to create beautiful interactive plots with just a single line of Python code:

import plotly.express as px

fig = px.scatter(
    new_profile[:100],
    x="followers",
    y="total_stars",
    color="forks",
    size="contribution",
)
fig.show()

plotly scatter plot

Simplicity in complex plots

Plotly simplifies the creation of complex plots that might be challenging with other libraries.

For example, given the latitudes and longitudes of GitHub users, we can plot their locations on a map with a single function call:

location_df = pd.read_csv(
    "https://gist.githubusercontent.com/khuyentran1401/ce61bbad3bc636bf2548d70d197a0e3f/raw/ab1b1a832c6f3e01590a16231ba25ca5a3d761f3/location_df.csv",
    index_col=0,
)

m = px.scatter_geo(
    location_df,
    lat="latitude",
    lon="longitude",
    color="total_stars",
    size="forks",
    hover_data=["user_name", "followers"],
    title="Locations of Top Users",
)

m.show()

plotly scatter plot with map

In this example, the color of the bubbles represents the number of stars, while the size corresponds to the number of forks.

Business intelligence features

Plotly offers enterprise-grade visualization features including interactive drill-down charts, real-time data updates, and cross-filtering capabilities. Business users can create comprehensive dashboards that allow stakeholders to explore data relationships, filter across multiple dimensions, and export insights for presentations.

The following example shows an interactive sunburst chart for hierarchical data exploration. Users can click through data layers from regions to departments and view detailed revenue breakdowns on hover.


# Create hierarchical data for drill-down
df = pd.DataFrame({
    'Region': ['North', 'North', 'South', 'South'],
    'Department': ['Sales', 'Marketing', 'Sales', 'Marketing'],
    'Revenue': [250000, 180000, 200000, 150000],
    'Quarter': ['Q4', 'Q4', 'Q4', 'Q4']
})

# Sunburst chart for hierarchical drill-down
fig = px.sunburst(df, path=['Region', 'Department'], 
                  values='Revenue',
                  title='Revenue Drill-down: Region → Department')

# Show each sector's label and its percentage of the parent sector
fig.update_traces(textinfo="label+percent parent")
fig.update_layout(height=500)
fig.show()

plotly sunburst chart

The following example shows an animated bubble chart. Setting animation_frame adds play and stop buttons and a year slider, so users can step through decades of data.


df = px.data.gapminder()

fig = px.scatter(
    df,
    x="gdpPercap",
    y="lifeExp",
    animation_frame="year",
    size="pop",
    color="continent",
    log_x=True,
    size_max=55,
    range_x=[100, 100000],
    range_y=[25, 90],
    title="GDP vs Life Expectancy by Year",
)


fig.show()

plotly animated bubble chart

Cons

Heavy dependencies

Plotly comes with substantial dependencies that can significantly increase your project’s size and deployment complexity. The full Plotly package includes multiple rendering engines, which may be overkill for simple visualization needs and can slow down application startup times.

Key Takeaways

Plotly excels at creating interactive and publication-quality visualizations with minimal code required. While it offers a wide range of visualizations and simplifies complex plots, consider the substantial dependencies that can increase project size and deployment complexity.

Altair

Altair is a powerful declarative statistical visualization library for Python that is based on Vega-Lite. It shines when it comes to creating plots that require extensive statistical transformations.

Pros

Simple visualization grammar

Altair utilizes intuitive grammar for creating visualizations. You only need to specify the links between data columns and encoding channels, and the rest of the plotting is handled automatically. This simplicity makes visualizing information fast and intuitive.

For instance, to count the number of people in each class using the Titanic dataset:

import altair as alt

titanic = sns.load_dataset("titanic")

alt.Chart(titanic).mark_bar().encode(alt.X("class"), y="count()")

altair bar chart

Altair’s concise syntax allows you to focus on the data and its relationships, resulting in efficient and expressive visualizations.

Easy data transformation

Altair makes it effortless to perform data transformations while creating charts.

For example, if you want to find the average age of each sex in the Titanic dataset, you can perform the transformation within the code itself:

mean_age_by_sex = (
    alt.Chart(titanic)
    .mark_bar()
    .encode(x="sex:N", y="mean_age:Q")
    .transform_aggregate(mean_age="mean(age)", groupby=["sex"])
)

mean_age_by_sex

altair bar chart with transformation

Altair’s transform_aggregate() function enables you to aggregate data on the fly and use the results in your visualization.

You can also specify the data type, such as nominal (categorical data without any order) or quantitative (measures of values), using the :N or :Q notation.

See the Altair documentation for the full list of data transformations.

Linked plots

Altair provides impressive capabilities for linking multiple plots together. You can use selections to filter the contents of the attached plots based on user interactions.

For example, to visualize the number of people in each class within a selected interval on a scatter plot:

brush = alt.selection_interval()

points = (
    alt.Chart(titanic)
    .mark_point()
    .encode(
        x="age:Q",
        y="fare:Q",
        color=alt.condition(brush, "class:N", alt.value("lightgray")),
    )
    .add_params(brush)
)

bars = (
    alt.Chart(titanic)
    .mark_bar()
    .encode(y="class:N", color="class:N", x="count(class):Q")
    .transform_filter(brush)
)

points & bars

altair linked plots

As you select an interval within the scatter plot, the bar chart dynamically updates to reflect the filtered data. Altair’s ability to link plots allows for highly interactive visualizations with on-the-fly calculations, without the need for a running Python server.

Cons

Limited styling options

Altair’s simple charts, such as bar charts, may not look as styled as those in libraries like seaborn or Plotly unless you specify custom styling.

Dataset size limitations

By default, Altair raises a MaxRowsError when a dataset has more than 5,000 rows, and it recommends aggregating or otherwise reducing the data before visualization. Handling larger datasets requires additional steps to manage data size and complexity.

Key Takeaways

Altair excels at statistical visualization with intuitive grammar and linked plots that enable interactive exploration. While it simplifies complex data transformations, its styling limitations and constraints with large datasets may require workarounds for specialized visualization needs.

Bokeh

Bokeh is a highly flexible interactive visualization library designed for web browsers.

Pros

Interactive version of Matplotlib

Bokeh stands out as the most similar library to Matplotlib when it comes to interactive visualization. While Matplotlib is a low-level visualization library, Bokeh offers both high-level and low-level interfaces. With Bokeh, you can create sophisticated plots similar to Matplotlib but with fewer lines of code and higher resolution.

For example, the circle plot of Matplotlib:

import matplotlib.pyplot as plt

fig, ax = plt.subplots()

xs = [1, 2, 3, 4, 5]
ys = [2, 5, 8, 2, 7]

# Draw one circle patch per (x, y) pair
for x, y in zip(xs, ys):
    ax.add_patch(
        plt.Circle((x, y), 0.5, edgecolor="#f03b20", facecolor="#9ebcda", alpha=0.8)
    )

# Use adjustable="box" to make the plot area square-shaped as well.
ax.set_aspect("equal", adjustable="datalim")
ax.set_xbound(3, 4)

ax.plot()  # Causes an autoscale update.
plt.show()

matplotlib circle plot

…can be achieved with better resolution and interactivity using Bokeh:

from bokeh.io import show, output_notebook
from bokeh.models.glyphs import Scatter
from bokeh.plotting import figure

output_notebook()

plot = figure(tools="tap", title="Select a circle")
renderer = plot.scatter([1, 2, 3, 4, 5], [2, 5, 8, 2, 7], size=50, marker="circle")

selected_circle = Scatter(size=50, fill_alpha=1, fill_color="firebrick", line_color=None, marker="circle")
nonselected_circle = Scatter(size=50, fill_alpha=0.2, fill_color="blue", line_color="firebrick", marker="circle")

renderer.selection_glyph = selected_circle
renderer.nonselection_glyph = nonselected_circle

show(plot)

bokeh circle plot

Link between plots

Bokeh makes it incredibly easy to establish links between plots. Changes applied to one plot can be automatically reflected in another plot with similar variables. This feature allows for exploring relationships between multiple plots.

For instance, if you create three graphs side by side and want to observe their relationship, you can utilize linked brushing:

from bokeh.layouts import gridplot
from bokeh.models import ColumnDataSource

source = ColumnDataSource(new_profile)

TOOLS = "box_select,lasso_select,help"
TOOLTIPS = [
    ("user", "@user_name"),
    ("followers", "@followers"),
    ("following", "@following"),
    ("forks", "@forks"),
    ("contribution", "@contribution"),
]

s1 = figure(tooltips=TOOLTIPS, title=None, tools=TOOLS)
s1.scatter(x="followers", y="following", source=source)

s2 = figure(tooltips=TOOLTIPS, title=None, tools=TOOLS)
s2.scatter(x="followers", y="forks", source=source)

s3 = figure(tooltips=TOOLTIPS, title=None, tools=TOOLS)
s3.scatter(x="followers", y="contribution", source=source)

p = gridplot([[s1, s2, s3]])
show(p)

By utilizing ColumnDataSource, the data can be shared among plots. Thus, when a change is applied to one plot, the other plots automatically update accordingly.

bokeh linked plots
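
Linking is not limited to selections. As a small sketch with made-up data, passing one figure's range object to another figure links their panning and zooming along that axis. The figure titles and values here are illustrative only.

```python
from bokeh.layouts import gridplot
from bokeh.plotting import figure

x = [1, 2, 3, 4, 5]

left = figure(width=300, height=300, title="y = x")
left.scatter(x, x, size=8)

# Reusing left's x_range object makes both plots pan and zoom together on x
right = figure(
    width=300, height=300, title="y = x squared", x_range=left.x_range
)
right.scatter(x, [v**2 for v in x], size=8)

layout = gridplot([[left, right]])
# Call bokeh.io.show(layout) to render the linked plots
```

Both figures hold the same range object, so a pan or zoom in one updates the other without any callback code.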

Fine-grained control over web deployment

Bokeh offers exceptional control over visualization deployment in web applications. You can embed plots in existing websites, create standalone HTML files, or build complete web applications with custom servers, providing seamless integration flexibility.
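
For the standalone HTML option, a minimal sketch could look like the following. The plot data, title, and output filename are placeholders chosen for illustration.

```python
from pathlib import Path

from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Placeholder plot with made-up values
p = figure(title="Standalone example", width=400, height=300)
p.line([1, 2, 3, 4], [3, 1, 4, 2], line_width=2)

# Build a complete HTML document that loads BokehJS from the CDN
html = file_html(p, resources=CDN, title="Standalone example")

# Hypothetical output path; the file can be opened directly in a browser
Path("standalone_plot.html").write_text(html, encoding="utf-8")
```

The resulting file needs no Python process to display, although the browser does need network access to fetch BokehJS from the CDN.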

Here’s how to embed a Bokeh plot in your existing website. First, create your Bokeh visualization as usual:

import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Create sample sales data
df = pd.DataFrame({
    'sales': [100, 150, 200, 120, 180, 250, 300, 220],
    'profit': [20, 30, 45, 25, 40, 60, 75, 50],
    'category': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B']
})
source = ColumnDataSource(df)

# Create plot
p = figure(title="Sales Analysis Widget", width=500, height=300)
p.scatter('sales', 'profit', source=source, size=10, alpha=0.6)

Next, generate the HTML components for web embedding:

from bokeh.embed import components

# Generate components for embedding
script, div = components(p)

Finally, integrate these components into your website. Add the BokehJS CDN script to your HTML template’s head section, using the version that matches your installed Bokeh release (3.3.0 in this example):

<head>
    <script src="https://cdn.bokeh.org/bokeh/release/bokeh-3.3.0.min.js"></script>
</head>

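Then include the plot components in your page body where you want the visualization to appear. The Jinja-style placeholders below assume your view passes div to the template as plot_div and script as plot_script: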

<body>
    <!-- Your existing HTML content -->
    {{ plot_div|safe }}    <!-- Insert the div component -->
    {{ plot_script|safe }} <!-- Insert the script component -->
</body>
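
To show how the pieces connect, here is a hedged sketch that renders such a template with Jinja2 directly. The inline template string, the bokeh_js variable name, and the sample values are assumptions for illustration; a real site would load its own template file through its web framework.

```python
from bokeh.embed import components
from bokeh.plotting import figure
from bokeh.resources import CDN
from jinja2 import Template

# Small made-up plot standing in for the sales chart
p = figure(title="Sales Analysis Widget", width=500, height=300)
p.scatter([100, 150, 200], [20, 30, 45], size=10, alpha=0.6)

script, div = components(p)

# Hypothetical inline template mirroring the head and body snippets
template = Template(
    """<!DOCTYPE html>
<html>
<head>
    {{ bokeh_js|safe }}
</head>
<body>
    {{ plot_div|safe }}
    {{ plot_script|safe }}
</body>
</html>"""
)

html = template.render(
    bokeh_js=CDN.render_js(),  # script tags for the installed Bokeh version
    plot_div=div,
    plot_script=script,
)
```

Generating the script tags with CDN.render_js() instead of hardcoding a URL keeps the BokehJS version in step with the installed Python package.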

Cons

Verbose code requirements

While Bokeh offers powerful customization, it demands considerably more setup code to make even simple plots look professional compared to higher-level alternatives like seaborn and Plotly.

For the same titanic count plot, you must transform the data beforehand and manually configure bar width and colors to achieve an attractive result.

If we don’t set a width for the bars, the graph looks like this:

from bokeh.transform import factor_cmap
from bokeh.palettes import Spectral6

titanic_groupby = titanic.groupby("class")["survived"].sum().reset_index()

p = figure(x_range=list(titanic_groupby["class"]))
p.vbar(
    x="class",
    top="survived",
    source=titanic_groupby,
    fill_color=factor_cmap(
        "class", palette=Spectral6, factors=list(titanic_groupby["class"])
    ),
)
show(p)

bokeh bar plot

Thus, we need to manually adjust the dimensions to make the plot nicer:

p = figure(x_range=list(titanic_groupby["class"]))
p.vbar(
    x="class",
    top="survived",
    width=0.9,
    source=titanic_groupby,
    fill_color=factor_cmap(
        "class", palette=Spectral6, factors=list(titanic_groupby["class"])
    ),
)
show(p)

bokeh bar plot with adjusted width

Key Takeaways

Bokeh excels at web deployment with flexible integration options for existing websites, standalone files, and custom servers. However, its comprehensive capabilities often require more verbose code compared to higher-level libraries like Plotly or seaborn when creating similar visualizations.

Conclusion

Congratulations! You’ve explored six powerful Python visualization libraries, each excelling in different scenarios:

  • Matplotlib – Choose for complete customization control and publication-quality static plots
  • seaborn – Select for statistical analysis and elegant plots with minimal code
  • Plotly – Use for interactive dashboards and web-ready visualizations
  • Altair – Pick for declarative data exploration through grammar of graphics
  • Pygal – Opt for lightweight web integration and simple SVG charts
  • Bokeh – Choose for complex web applications and flexible deployment options

Match your project’s specific requirements – whether it’s interactivity, customization level, or deployment target – to select the optimal library.

Related Resources

  • Scale your data processing with Polars for high-performance DataFrame operations
  • Enhance your development workflow with Marimo notebooks for reproducible visualization creation

    Work with Khuyen Tran
