Motivation
Data scientists and analysts often need to share their Python applications with non-technical users who don’t have Python installed. Distributing Python scripts that require installing Python, managing dependencies, and setting up environments creates barriers for end-users.
Example:
# Instructions for end users:
# 1. Install Python 3.8+
# 2. Create virtual environment
# 3. Install requirements:
pip install pandas numpy scikit-learn matplotlib
# 4. Run the script:
python simple_df.py
# This process is complex for non-technical users
Introduction to PyInstaller
PyInstaller is a tool that bundles a Python application and all its dependencies into a single package or executable file. This allows end-users to run the application without installing Python or any additional dependencies.
Installation:
pip install pyinstaller
Creating Standalone Executables
PyInstaller solves the distribution problem by:
- Analyzing your code to identify all required dependencies
- Collecting all necessary files including the Python interpreter
- Creating a single executable or folder that runs on the platform it was built on (PyInstaller does not cross-compile, so build once per target operating system)
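To give a feel for the first step, dependency analysis, here is a minimal sketch that scans a script's source for import statements using the standard-library ast module. It is only an illustration of the idea and not PyInstaller's actual analysis, which also follows imports inside the imported packages; the function name top_level_imports is made up for this example.

```python
import ast


def top_level_imports(source):
    """Return the top-level package names imported by a piece of Python source.

    Illustrative only: PyInstaller's real analysis is more thorough than this.
    """
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import pandas.io as x" contributes "pandas"
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from sklearn.linear_model import X" contributes "sklearn";
            # relative imports (level > 0) are skipped
            found.add(node.module.split(".")[0])
    return found


example = (
    "import os\n"
    "import pandas as pd\n"
    "from sklearn.linear_model import LinearRegression\n"
)
print(sorted(top_level_imports(example)))  # ['os', 'pandas', 'sklearn']
```

A real bundler then has to locate each of these packages on disk and repeat the scan for their own imports, which is the part PyInstaller automates.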
Here’s an example of using PyInstaller:
Create a simple script that prints a pandas DataFrame:
# simple_df.py
import pandas as pd
df = pd.DataFrame({
    'Name': ['John', 'Anna', 'Peter'],
    'Age': [28, 24, 35],
    'City': ['New York', 'Paris', 'London']
})
print("\nEmployee Information:")
print("===================")
print(df)
Create the executable:
# Basic usage
pyinstaller simple_df.py
# Create single file executable
pyinstaller --onefile simple_df.py
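The same build can also be started from Python, which may be convenient in a build script. The sketch below assumes PyInstaller is installed and uses its documented PyInstaller.__main__.run entry point; the helper names pyinstaller_args and build are hypothetical and exist only for this example.

```python
def pyinstaller_args(script, onefile=True, name=None):
    """Build the argument list that mirrors the pyinstaller command line."""
    args = [script]
    if onefile:
        args.append("--onefile")
    if name is not None:
        # --name sets the name of the generated executable
        args += ["--name", name]
    return args


def build(script, **options):
    """Run PyInstaller in-process (assumes `pip install pyinstaller` was done)."""
    # Imported lazily so pyinstaller_args stays usable without PyInstaller installed.
    import PyInstaller.__main__

    PyInstaller.__main__.run(pyinstaller_args(script, **options))


if __name__ == "__main__":
    print(pyinstaller_args("simple_df.py"))  # ['simple_df.py', '--onefile']
    # build("simple_df.py")  # uncomment to run the actual build
```

Keeping the argument list in a separate function makes it easy to inspect what would be passed before triggering a build.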
With --onefile, the executable is created directly in the dist directory (the default one-folder build places it in dist/simple_df/ instead). Users can run it without installing Python:
# Windows
dist\simple_df.exe
# Linux/Mac
./dist/simple_df
Output:
Employee Information:
===================
    Name  Age      City
0   John   28  New York
1   Anna   24     Paris
2  Peter   35    London
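Where the executable ends up depends on the build mode and the platform. As a rough sketch, assuming the default dist output directory and no custom --name, the expected location can be computed like this; expected_executable is a hypothetical helper, not part of PyInstaller.

```python
from pathlib import PurePosixPath, PureWindowsPath


def expected_executable(script, onefile=False, windows=False):
    """Return where PyInstaller is expected to place the executable.

    Assumes the default output directory ("dist") and no --name override.
    """
    name = PurePosixPath(script).stem  # "simple_df.py" -> "simple_df"
    exe = name + (".exe" if windows else "")
    # One-file builds sit directly in dist; one-folder builds get a subfolder.
    parts = ["dist", exe] if onefile else ["dist", name, exe]
    path_cls = PureWindowsPath if windows else PurePosixPath
    return str(path_cls(*parts))


print(expected_executable("simple_df.py", onefile=True, windows=True))
print(expected_executable("simple_df.py", onefile=False, windows=False))
```

For a one-folder build, the whole dist/simple_df folder has to be shipped to users, not just the executable inside it.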
Conclusion
PyInstaller makes it easy to distribute Python applications to non-technical users by creating standalone executables. This is particularly useful for data scientists who want to share their analysis tools with stakeholders who don’t have Python expertise.