Question

Does running StandardScaler and then a classifier give the same results as using a Pipeline?

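Hi, I have a classification problem and am trying to scale the X variables with scikit-learn's StandardScaler(). I see two options; should they produce the same results in theory? I ask because with option (1) I get a higher precision score on my test dataset.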

(1)

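from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Fit the scaler on the training data and train the classifier on the scaled features
scalar = StandardScaler()
xtrain_ = scalar.fit_transform(xtrain)
RFC = RandomForestClassifier(n_estimators=100)
RFC.fit(xtrain_, ytrain)

# Reuse the fitted scaler on the test data, then cross-validate on the test set
xtest_ = scalar.transform(xtest)
score = cross_val_score(RFC, xtest_, ytest, cv=10, scoring='precision')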

(2)

from sklearn.pipeline import Pipeline

RFCs = Pipeline([("scale", StandardScaler()), ("rf", RandomForestClassifier(n_estimators=100))])
RFCs.fit(xtrain, ytrain)
scores = cross_val_score(RFCs, xytest, ytest, cv=10, scoring='precision')

Answer 1

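Your option (2) uses a different dataset (xytest) than option (1), which uses xtest. In addition, your cross-validation should include the training step, not just prediction: cross_val_score refits a clone of the estimator on every fold, so the earlier fit call on the training data has no effect on the reported scores.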
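As a rough illustration of cross-validation that includes training, here is a minimal sketch. It assumes synthetic data from make_classification in place of the asker's xtrain/ytrain, and the sample sizes, split ratio, and random_state values are illustrative choices, not taken from the question.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the asker's data (an assumption, not from the question).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
xtrain, xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.25, random_state=0
)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Cross-validate on the training data: each fold refits a clone of the
# whole pipeline, so the scaler only ever sees that fold's training part.
cv_scores = cross_val_score(pipe, xtrain, ytrain, cv=10, scoring="precision")

# Fit once on all the training data, then score once on the held-out test set.
pipe.fit(xtrain, ytrain)
test_precision = precision_score(ytest, pipe.predict(xtest))

print(cv_scores.mean(), test_precision)
```

With this layout the test set is used only for the final score, and the cross-validation scores come from models trained inside each fold.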

Other than that, they should be the same, and I would recommend using the pipeline.
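One way to check the equivalence yourself is sketched below. It assumes synthetic data and a fixed random_state on the forest (both are illustrative assumptions); without a fixed seed, the forest's own randomness can produce different scores even when the preprocessing is identical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the asker's data (an assumption, not from the question).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
xtrain, xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Option (1): scale manually, then fit the classifier on the scaled features.
scaler = StandardScaler()
rf_manual = RandomForestClassifier(n_estimators=100, random_state=0)
rf_manual.fit(scaler.fit_transform(xtrain), ytrain)
pred_manual = rf_manual.predict(scaler.transform(xtest))

# Option (2): the same two steps wrapped in a Pipeline.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
pipe.fit(xtrain, ytrain)
pred_pipe = pipe.predict(xtest)

# Compare the two sets of predictions element by element.
same = np.array_equal(pred_manual, pred_pipe)
print(same)
```

Both variants fit the scaler on the same training data and feed the same scaled values to a forest with the same seed, so the predictions should match. The pipeline's practical advantage shows up under cross-validation, where it refits the scaler inside each fold automatically.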
