Transformers and the transform() Method
Transformers in scikit-learn modify data by applying transformations such as scaling, normalization, or encoding to prepare it for modeling. Each transformer follows a consistent interface: the fit() method learns any necessary parameters from the data, and the transform() method applies the transformation using those parameters. For instance, StandardScaler calculates the mean and standard deviation of each feature during fit() and then uses those values in transform() to subtract the mean and divide by the standard deviation (if you remember back to high school statistics, this transformed value is called a z-score).
from sklearn.preprocessing import StandardScaler
import numpy as np
# Example data
X = np.array([[1, 2], [3, 4], [5, 6]])
# Create a StandardScaler instance
scaler = StandardScaler()
# Fit the scaler on the data
scaler.fit(X)
# Transform the data
X_scaled = scaler.transform(X)
print(X_scaled)
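To make the split between fit() and transform() more concrete, the following minimal sketch checks that the scaler's output matches a z-score computed by hand, and then reuses the learned parameters on a new sample. The names X_manual and X_new are illustrative assumptions, not part of the original example; mean_ and scale_ are the attributes where StandardScaler stores what it learned.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Same example data as above, as floats
X = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)

scaler = StandardScaler()
scaler.fit(X)

# Parameters learned during fit(): per-column mean and standard deviation
learned_mean = scaler.mean_
learned_std = scaler.scale_

# Manual z-score, using NumPy's default population standard deviation (ddof=0)
X_manual = (X - X.mean(axis=0)) / X.std(axis=0)

# transform() should agree with the manual computation
matches_manual = np.allclose(scaler.transform(X), X_manual)
print(matches_manual)

# transform() reuses the stored parameters, so new data is scaled
# with the mean and standard deviation of the data seen during fit()
X_new = np.array([[3.0, 4.0]])  # hypothetical new sample
X_new_scaled = scaler.transform(X_new)
print(X_new_scaled)
```

Because the hypothetical sample equals the column means of the fitted data, its scaled values come out as zeros, which is one quick way to see that transform() does not recompute anything from the new input.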
As we saw before, another common shortcut, fit_transform(), allows users to perform both...