Automate Time Series Feature Engineering with tsfresh

Data scientists spend much of their time cleaning data or building features. While the former is unavoidable, the latter can be automated.

tsfresh automatically extracts a large number of features from each time series, then uses a robust feature selection algorithm based on statistical hypothesis tests to keep only the relevant ones, freeing up data scientists’ time.

To demonstrate this, start by loading an example dataset:

from tsfresh.examples.robot_execution_failures import (
    download_robot_execution_failures,
    load_robot_execution_failures,
)

download_robot_execution_failures()
timeseries, y = load_robot_execution_failures()
timeseries.head()

Output:

   id  time  F_x  F_y  F_z  T_x  T_y  T_z
0   1     0   -1   -1   63   -3   -1    0
1   1     1    0    0   62   -3   -1    0
2   1     2   -1   -1   61   -3    0    0
3   1     3   -1   -1   63   -2   -1    0
4   1     4   -1   -1   63   -3   -1    0

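Next, extract features and keep only those that are relevant to the target y. The column_id argument names the column that identifies which time series each row belongs to, and column_sort names the column that orders the rows within a series: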

from tsfresh import extract_relevant_features

# extract relevant features
features_filtered = extract_relevant_features(
    timeseries, y, column_id="id", column_sort="time"
)

You can now use the features in features_filtered to train your classification model.
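As one possible way to do that, the sketch below fits a scikit-learn random forest on a feature matrix and reports accuracy on a held-out split. The helper name train_and_score, the choice of classifier, and the 75/25 split are assumptions made for illustration; any classifier that accepts a DataFrame would work in the same way.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_and_score(features: pd.DataFrame, labels: pd.Series, seed: int = 0) -> float:
    """Fit a classifier on the feature matrix and return held-out accuracy."""
    # Align labels with the feature rows; both are assumed to share the same index.
    labels = labels.loc[features.index]
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.25, random_state=seed, stratify=labels
    )
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))
```

With the variables from the snippets above, this hypothetical helper would be called as train_and_score(features_filtered, y).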

