Data scientists spend much of their time cleaning data or building features. While the former is unavoidable, the latter can be automated.
tsfresh uses a robust feature selection algorithm to automatically extract relevant time series features, freeing up data scientists’ time.
To demonstrate this, start by loading an example dataset:
from tsfresh.examples.robot_execution_failures import (
    download_robot_execution_failures,
    load_robot_execution_failures,
)
download_robot_execution_failures()
timeseries, y = load_robot_execution_failures()
timeseries.head()
Output:
   id  time  F_x  F_y  F_z  T_x  T_y  T_z
0   1     0   -1   -1   63   -3   -1    0
1   1     1    0    0   62   -3   -1    0
2   1     2   -1   -1   61   -3    0    0
3   1     3   -1   -1   63   -2   -1    0
4   1     4   -1   -1   63   -3   -1    0
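The table above is in "long" format: one row per time step, an id column naming the series, and a time column ordering the rows within it. To bring your own data into the same layout, something like the following minimal sketch may help. The `raw` dictionary and its values are made up for illustration and are not part of the example dataset.

```python
import pandas as pd

# Hypothetical raw recordings: one list of (F_x, F_y) readings per series id.
raw = {
    1: [(-1, -1), (0, 0), (-1, -1)],
    2: [(2, 3), (1, 4)],
}

rows = []
for series_id, readings in raw.items():
    for t, (f_x, f_y) in enumerate(readings):
        # One row per time step; "time" orders the rows within a series.
        rows.append({"id": series_id, "time": t, "F_x": f_x, "F_y": f_y})

long_df = pd.DataFrame(rows)
print(long_df)
```

Series of different lengths are fine in this layout, since each row carries its own id and time.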
Next, extract features from each time series and keep only those that are relevant to the target y:
from tsfresh import extract_relevant_features
# Compute features for each id (rows ordered by time) and keep those relevant to y
features_filtered = extract_relevant_features(
    timeseries, y, column_id="id", column_sort="time"
)
)
You can now use the features in features_filtered to train your classification model.
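As one possible next step, the sketch below fits a scikit-learn classifier on a feature matrix and reports held-out accuracy. The helper `train_and_score` is hypothetical and not part of tsfresh. The choice of a random forest and a 75/25 split is an assumption made for illustration. The helper also assumes the features and labels share an index with one row per time series id.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_and_score(features: pd.DataFrame, labels: pd.Series, seed: int = 0):
    """Fit a classifier and return (model, held-out accuracy).

    Hypothetical helper, not part of tsfresh.
    """
    # Align labels to the feature rows so each row gets the right target.
    labels = labels.loc[features.index]
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.25, random_state=seed, stratify=labels
    )
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    return model, accuracy_score(y_test, model.predict(X_test))


# With the objects from the example above:
# model, accuracy = train_and_score(features_filtered, y)
```

Note that the features here were selected using the labels of every series, including those that end up in the held-out split, so the reported accuracy is likely optimistic. For an unbiased estimate, run the selection on the training portion only.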