Split Data in a Stratified Fashion in scikit-learn

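When using scikit-learn’s train_test_split, pass stratify=y if you want the class proportions in the training and test sets to match the class proportions in the entire dataset. Without it, a random split can over- or under-represent a class by chance, which matters most when the classes are imbalanced.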
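Here is a minimal sketch of what that looks like in practice. The toy dataset (900 samples of class 0, 100 of class 1), the 80/20 split, and the helper name class_share are assumptions made for illustration, not part of the original tip.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy imbalanced dataset (illustrative assumption): 90% class 0, 10% class 1.
y = np.array([0] * 900 + [1] * 100)
X = np.arange(len(y)).reshape(-1, 1)  # a single dummy feature


def class_share(labels):
    """Return the fraction of samples belonging to each class."""
    return np.bincount(labels) / len(labels)


# stratify=y asks train_test_split to preserve the class proportions of y
# in both the training and the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

print("full :", class_share(y))
print("train:", class_share(y_train))
print("test :", class_share(y_test))
```

With this toy data, all three printed lines should show the same 90/10 share. Dropping stratify=y from the call would typically let the test-set proportions drift a little from the full dataset, depending on random_state.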

My previous tips on feature engineering in Python.
