Enhance Code Modularity and Reusability with Temporary Views in PySpark

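In PySpark, a temporary view is a named, session-scoped reference to a DataFrame that can be queried with SQL. Registering a DataFrame once and then querying it by name from several places makes code more reusable and modular.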

To demonstrate this, let’s create a PySpark DataFrame called orders_df.

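# Get a Spark session (reuses the active one if it already exists)
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("temp_views").getOrCreate()

# Create a sample DataFrame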
data = [
    (1001, "John Doe", 500.0),
    (1002, "Jane Smith", 750.0),
    (1003, "Bob Johnson", 300.0),
    (1004, "Sarah Lee", 400.0),
    (1005, "Tom Wilson", 600.0),
]

columns = ["customer_id", "customer_name", "revenue"]
orders_df = spark.createDataFrame(data, columns)

Next, create a temporary view called orders from the orders_df DataFrame using the createOrReplaceTempView method.

# Create a temporary view
orders_df.createOrReplaceTempView("orders")

With the temporary view in place, any SQL query run through spark.sql can refer to orders by name. Each call returns a new DataFrame.

# Perform operations on the temporary view
total_revenue = spark.sql("SELECT SUM(revenue) AS total_revenue FROM orders")
order_count = spark.sql("SELECT COUNT(*) AS order_count FROM orders")

# Display the results
print("Total Revenue:")
total_revenue.show()

print("\nNumber of Orders:")
order_count.show()

Output:

Total Revenue:
+-------------+
|total_revenue|
+-------------+
|       2550.0|
+-------------+


Number of Orders:
+-----------+
|order_count|
+-----------+
|          5|
+-----------+
