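I want to apply mean target encoding to my train and test datasets after splitting them (with stratification), and to do that I need to merge them back together.
How can I do this? Any suggestions would be greatly appreciated. Thank you.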
from sklearn.model_selection import train_test_split

# R holds the features and target holds the labels
X_train, X_test, y_train, y_test = train_test_split(R,
                                                    target,
                                                    test_size=0.25,
                                                    random_state=7,
                                                    stratify=target)
print("Number transactions X_train dataset: ", X_train.shape)
print("Number transactions y_train dataset: ", y_train.shape)
print("Number transactions X_test dataset: ", X_test.shape)
print("Number transactions y_test dataset: ", y_test.shape)
Here is the output:
Number transactions X_train dataset: (37779, 89)
Number transactions y_train dataset: (37779,)
Number transactions X_test dataset: (12593, 89)
Number transactions y_test dataset: (12593,)
Answer 0 (score: 0)
You can concatenate the arrays row-wise:
import numpy as np
# Stack the train and test parts row-wise; the result is a NumPy array
X_combined = np.r_[X_train, X_test]
y_combined = np.r_[y_train, y_test]
You can find a more in-depth discussion in other SO questions.
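If the split parts are pandas objects, here is a minimal sketch of one way to do the encoding and then recombine the parts. It rests on several assumptions that are not in the question: the toy data, the column name `city`, and the new column `city_te` are all hypothetical, and the category means are learned from the training rows only, which is a common way to keep test labels out of the encoding. `pd.concat` is used in place of `np.r_` here because it keeps the column names and the original row index.

```python
import pandas as pd

# Toy stand-ins for X_train / X_test / y_train / y_test from the question.
# The column name "city" and all values are hypothetical.
X_train = pd.DataFrame({"city": ["a", "a", "b", "b"]}, index=[0, 2, 3, 5])
y_train = pd.Series([1, 0, 1, 1], index=X_train.index, name="target")
X_test = pd.DataFrame({"city": ["a", "c"]}, index=[1, 4])
y_test = pd.Series([0, 1], index=X_test.index, name="target")

# Learn the per-category target mean from the training rows only.
category_means = y_train.groupby(X_train["city"]).mean()
global_mean = y_train.mean()

# Apply the same mapping to both parts; categories never seen in training
# fall back to the global training mean.
X_train_enc = X_train.assign(city_te=X_train["city"].map(category_means))
X_test_enc = X_test.assign(
    city_te=X_test["city"].map(category_means).fillna(global_mean)
)

# Recombine row-wise and restore the original row order.
X_combined = pd.concat([X_train_enc, X_test_enc]).sort_index()
y_combined = pd.concat([y_train, y_test]).sort_index()
```

Because both parts keep their original index labels, `sort_index()` puts the rows back in the order they had before the split. If your data are plain NumPy arrays, the `np.r_` approach above does the row-wise stacking but does not restore the original order.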