Testing Archives

pytest-postgresql: Database Testing with pytest

SQL, Testing / Khuyen Tran

If you want to incorporate database testing seamlessly within your pytest test suite, use pytest-postgresql.

pytest-postgresql provides fixtures that manage the setup and cleanup of test databases, which keeps tests repeatable. Each test also runs in isolation, so changes made during testing never affect the production database.


Backtesting: Assess Trading Strategy Performance Effortlessly in Python

Testing, Time Series / Khuyen Tran

Evaluating the effectiveness of trading strategies is crucial for financial decision-making, but the complexity of historical data analysis and strategy testing makes it challenging.

Backtesting allows users to simulate trades based on historical data and visualize the outcomes through interactive plots in three lines of code.

To see how Backtesting works, let’s backtest our first strategy, a simple moving average (MA) cross-over, on the sample Google stock data that ships with the library.

from backtesting.test import GOOG

GOOG.tail()

             Open    High     Low   Close   Volume
2013-02-25  802.3  808.41  790.49  790.77  2303900
2013-02-26  795.0  795.95  784.40  790.13  2202500
2013-02-27  794.8  804.75  791.11  799.78  2026100
2013-02-28  801.1  806.99  801.03  801.20  2265800
2013-03-01  797.8  807.14  796.15  806.19  2175400

import pandas as pd

def SMA(values, n):
    """
    Return simple moving average of `values`, at
    each step taking into account `n` previous values.
    """
    return pd.Series(values).rolling(n).mean()

from backtesting import Strategy
from backtesting.lib import crossover

class SmaCross(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 10
    n2 = 20

    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)

    def next(self):
        # If sma1 crosses above sma2, close any existing
        # short trades, and buy the asset
        if crossover(self.sma1, self.sma2):
            self.position.close()
            self.buy()

        # Else, if sma1 crosses below sma2, close any existing
        # long trades, and sell the asset
        elif crossover(self.sma2, self.sma1):
            self.position.close()
            self.sell()
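
To make the cross-over condition used in next() concrete, here is a minimal pure-Python sketch of the idea. The helper name crossed_above and the sample values are hypothetical. It only approximates what a cross-over check expresses (one series was below the other on the previous bar and is above it on the current bar) and is not the library’s implementation.

```python
def crossed_above(a, b):
    """Return True if series `a` just moved from below `b` to above `b`.

    Hypothetical helper for illustration; it compares only the last two values.
    """
    if len(a) < 2 or len(b) < 2:
        return False  # not enough history to detect a cross
    return a[-2] < b[-2] and a[-1] > b[-1]


fast = [9.0, 9.5, 10.4]    # short moving average (made-up values)
slow = [10.0, 10.0, 10.1]  # long moving average (made-up values)

print(crossed_above(fast, slow))  # fast was below slow and is now above it
print(crossed_above(slow, fast))  # the reverse cross did not happen
```

Swapping the arguments, as the strategy does in its elif branch, checks for the opposite cross.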

To assess the performance of our investment strategy, we will instantiate a Backtest object, using Google stock data as our asset of interest and incorporating the SmaCross strategy class. We’ll start with an initial cash balance of 10,000 units and set the broker’s commission to a realistic rate of 0.2%.

from backtesting import Backtest

bt = Backtest(GOOG, SmaCross, cash=10_000, commission=.002)
stats = bt.run()
stats

Start                       2004-08-19 00:00:00
End                         2013-03-01 00:00:00
Duration                    3116 days 00:00:00
Exposure Time [%]           97.067039
Equity Final [$]            68221.96986
Equity Peak [$]             68991.21986
Return [%]                  582.219699
Buy & Hold Return [%]       703.458242
Return (Ann.) [%]           25.266427
Volatility (Ann.) [%]       38.383008
Sharpe Ratio                0.658271
Sortino Ratio               1.288779
Calmar Ratio                0.763748
Max. Drawdown [%]           -33.082172
Avg. Drawdown [%]           -5.581506
Max. Drawdown Duration      688 days 00:00:00
Avg. Drawdown Duration      41 days 00:00:00
# Trades                    94
Win Rate [%]                54.255319
Best Trade [%]              57.11931
Worst Trade [%]             -16.629898
Avg. Trade [%]              2.074326
Max. Trade Duration         121 days 00:00:00
Avg. Trade Duration         33 days 00:00:00
Profit Factor               2.190805
Expectancy [%]              2.606294
SQN                         1.990216
_strategy                   SmaCross
_equity_curve               …
_trades                     Size EntryB…
dtype: object

Plot the outcomes:

bt.plot()

Link to Backtesting.

Run in Google Colab.


Simplify Unit Testing of SQL Queries with PySpark

PySpark, SQL, Testing / Khuyen Tran

Testing your SQL queries helps to ensure that they are correct and functioning as intended.

PySpark enables users to parameterize queries, which simplifies unit testing of SQL queries. In this example, the df and amount variables are parameterized to verify whether the actual_df matches the expected_df.

Learn more about parameterized queries in PySpark.


Organize and Control Test Execution using pytest.mark

Testing / Khuyen Tran

pytest.mark lets you label test functions for conditional or selective execution based on specific needs.

For instance, you can mark slow tests or tests involving integration with external services to run them separately or exclude them from regular test runs. This helps you organize and execute your tests more effectively.
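
As a rough illustration, the sketch below labels one slow test and one integration test. The marker names slow and integration are custom names chosen for this example, not built-in markers, and the test bodies are placeholders.

```python
import time

import pytest


@pytest.mark.slow  # custom marker; the name "slow" is an assumption
def test_large_computation():
    time.sleep(0.1)  # stand-in for an expensive step
    assert sum(range(1_000)) == 499_500


@pytest.mark.integration  # custom marker for tests that would call external services
def test_service_roundtrip():
    response = {"status": "ok"}  # placeholder instead of a real network call
    assert response["status"] == "ok"


def test_fast_path():
    # Unmarked test: always part of the regular run
    assert 1 + 1 == 2
```

Running pytest -m "not slow" skips the slow test, while pytest -m integration runs only the integration test. Registering custom markers under the markers option in pytest.ini or pyproject.toml avoids unknown-marker warnings.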


tmp_path: Create a Temporary Directory for Testing

Testing / Khuyen Tran

Use the tmp_path fixture in pytest to create a temporary directory for testing functions that interact with files. This keeps tests from modifying your real project or production files.
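
Here is a minimal sketch of how this might look. The save_report function and the file name report.txt are hypothetical, used only to have something file-related to test.

```python
from pathlib import Path


def save_report(directory: Path, text: str) -> Path:
    """Write `text` to report.txt inside `directory` and return the file path."""
    path = directory / "report.txt"
    path.write_text(text, encoding="utf-8")
    return path


def test_save_report(tmp_path):
    # pytest injects tmp_path as a pathlib.Path pointing to a fresh, empty directory
    report = save_report(tmp_path, "all good")

    assert report.parent == tmp_path
    assert report.read_text(encoding="utf-8") == "all good"
```

Since the function takes the target directory as an argument, the test can point it at the temporary directory without any patching.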


Write Better Code with Test-Driven Development

Testing / Khuyen Tran

Test-driven development (TDD) is a technique that helps you write better code, faster and with more confidence. It emphasizes writing automated tests before writing the code. Here’s the process:

Create a test that fails because the code doesn’t exist yet.

Write the minimum amount of code necessary to make the test pass.

Refactor the code to make it more maintainable and efficient.

Repeat the process until your code meets the desired behaviors.
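
One pass through this cycle might look like the sketch below. The slugify function and its expected behavior are hypothetical, chosen only to illustrate the steps.

```python
import re


# Step 1: this test is written first; running it before slugify
# exists fails with a NameError.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  TDD, step by step!  ") == "tdd-step-by-step"


# Steps 2 and 3: a small implementation that makes the test pass,
# kept simple enough to refactor safely while the test stays green.
def slugify(title: str) -> str:
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```

Each new behavior, such as handling accented characters, would start with another failing assertion before any change to slugify.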

My articles on the topic of testing.


Test for Specific Exceptions in Unit Testing

Testing / Khuyen Tran

To test for a specific exception in unit testing, use pytest.raises.

For example, you can use it to check that a ValueError is raised when the group column contains NaN values.
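
A minimal sketch of that scenario follows. The mean_by_group function, the column names, and the error message are assumptions made for illustration.

```python
import pandas as pd
import pytest


def mean_by_group(df: pd.DataFrame) -> pd.Series:
    """Average `value` per `group`; reject rows whose group is missing."""
    if df["group"].isna().any():
        raise ValueError("NaN values found in the 'group' column")
    return df.groupby("group")["value"].mean()


def test_mean_by_group_rejects_nan():
    df = pd.DataFrame({"group": ["a", float("nan"), "b"], "value": [1.0, 2.0, 3.0]})

    # The test passes only if a ValueError with a matching message is raised
    with pytest.raises(ValueError, match="NaN values"):
        mean_by_group(df)
```

The match argument is optional; it additionally checks the exception message against a regular expression, which helps distinguish this error from other ValueErrors.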


Efficiently Generate Falsified Examples for Unit Tests with Pandera and Hypothesis

Pandas, Testing / Khuyen Tran

Writing readable edge cases for unit tests by hand is often difficult. By combining Pandera and Hypothesis, you can automatically find falsifying examples (inputs that make a test fail) and write cleaner tests.

Pandera allows you to define constraints for inputs and outputs, while Hypothesis automatically identifies edge cases that match the specified schema.

Hypothesis further simplifies complex examples until it finds a smaller example that still reproduces the issue.
