Pipeline + GridSearchCV: Prevent Data Leakage when Scaling the Data

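Scaling the data before passing it to GridSearchCV can lead to data leakage: the scaler is fit on the entire dataset, so its statistics (for example, the mean and standard deviation) carry information from the validation folds that are later used to score each candidate model.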
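To make the leak concrete, here is a minimal sketch on a toy array (the data and the 8/2 split are made up for illustration). It compares the mean a `StandardScaler` learns from all rows with the mean it learns from the training rows only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature column: the values 0..9 (illustrative data, not from the post).
X = np.arange(10, dtype=float).reshape(-1, 1)

# Pretend the first 8 rows are a training fold and the last 2 a validation fold.
X_train_fold, X_val_fold = X[:8], X[8:]

# Leaky: the scaler sees the validation rows when computing its statistics.
full_scaler = StandardScaler().fit(X)

# Leak-free: the scaler sees only the training fold.
train_scaler = StandardScaler().fit(X_train_fold)

print(full_scaler.mean_[0])   # mean over all 10 rows
print(train_scaler.mean_[0])  # mean over the 8 training rows only
```

The two means differ, which is exactly the information about held-out rows that leaks into training when scaling happens before cross-validation.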

To prevent this, combine the scaler and the machine learning model in a Pipeline, then pass that pipeline to GridSearchCV as the estimator. The scaler is then refit on only the training portion of each cross-validation split.
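A minimal sketch of this setup might look like the following. The dataset (`load_iris`), the model (`SVC`), the step names, and the parameter grid are assumptions chosen for illustration, not taken from the original post.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Example data; any feature matrix X and target y would work here.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaler and model live in one estimator, so they are always fit together.
pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("svc", SVC()),
])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>".
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.1, 1],
}

# For every split, GridSearchCV clones the pipeline and fits it on the
# training portion only, so the scaler never sees the validation fold.
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X_train, y_train)

print(grid.best_params_)
print(grid.score(X_test, y_test))
```

Because the scaler is a step of the pipeline, `grid.score` and `grid.predict` also apply the fitted scaling to new data automatically.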
