I want to do K-fold cross-validation. Before switching to K-fold, my code looked like this and ran fine:
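import numpy as np
import pandas as pd
from nltk.corpus import stopwords
from sklearn import preprocessing
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

df = pd.read_csv('finalupdatedothers-multilabel.csv')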
X= df[['sentences']]
dfy = df[['ADR','WD','EF','INF','SSI','DI','others']]
df1 = dfy.stack().reset_index()
df1.columns = ['a','b','c']
y_train_text = df1.groupby('a')['b'].apply(list)
lb = preprocessing.MultiLabelBinarizer()
# Run classifier
stop_words = stopwords.words('english')
classifier = make_pipeline(CountVectorizer(),
                           TfidfTransformer(),
                           #SelectKBest(chi2, k=4),
                           OneVsRestClassifier(SGDClassifier()))
#combined_features = FeatureUnion([("pca", pca), ("univ_select", selection)])
random_state = np.random.RandomState(0)
# Split into training and test
X_train, X_test, y_train, y_test = train_test_split(X_train, y_train_text, test_size=.2,
                                                    random_state=random_state)
print(y_train)
# # Binarize the output classes
Y = lb.fit_transform(y_train)
Y_test=lb.transform(y_test)
classifier.fit(X_train, Y)
y_score = classifier.fit(X_train, Y).decision_function(X_test)
print ("y_score"+str(y_score))
predicted = classifier.predict(X_test)
all_labels = lb.inverse_transform(predicted)
#print accuracy_score
print ("accuracy : "+str(accuracy_score(Y_test, predicted)))
print ("micro f-measure "+str(f1_score(Y_test, predicted, average='weighted')))
print("precision"+str(precision_score(Y_test,predicted,average='weighted')))
print("recall"+str(recall_score(Y_test,predicted,average='weighted')))
for item, labels in zip(X_test, all_labels):
    print ('%s => %s' % (item, ', '.join(labels)))
When I changed the code to use k-fold cross-validation instead of train_test_split, I got this error:
ValueError: Found input variables with inconsistent numbers of samples: [1, 6008]
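For what it's worth, one plausible reading of the `[1, 6008]` in this error is that the vectorizer saw a single document. Iterating over a pandas DataFrame yields its column labels, not its rows, so a one-column frame such as `df[['sentences']]` looks like one "document" to anything that simply loops over its input. A minimal sketch with made-up sentences (the toy data is an assumption, not the real file):

```python
import pandas as pd

# Toy stand-in for the real CSV (hypothetical sentences).
df = pd.DataFrame({'sentences': ['first text', 'second text', 'third text']})

X_frame = df[['sentences']]   # double brackets: a one-column DataFrame
X_series = df['sentences']    # single brackets: a Series of strings

# Iterating a DataFrame walks over column labels, not rows.
docs_from_frame = list(X_frame)
# Iterating a Series walks over the values themselves.
docs_from_series = list(X_series)

print(len(docs_from_frame), docs_from_frame)
print(len(docs_from_series), docs_from_series)
```

If that is what is happening, the feature matrix would have one row while the label matrix has one row per sentence, which matches the shape of the mismatch reported above.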
Updated to use iloc. My code using k-fold cross-validation is as follows:
from sklearn.model_selection import KFold

kf = KFold(n_splits=10)
kf.get_n_splits(X)
KFold(n_splits=2, random_state=None, shuffle=False)
for train_index, test_index in kf.split(X):
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y_train_text.iloc[train_index], y_train_text.iloc[test_index]
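A hedged sketch of how a loop like this could be arranged so that the text and the labels are sliced by the same fold indices. It assumes the text is selected as a Series (single brackets) and that the labels are binarized once before splitting, so every fold shares the same label columns. The toy sentences and label lists are invented for illustration, and fitting the classifier is left out; it would go inside the loop.

```python
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical toy data standing in for the real sentences and label lists.
sentences = pd.Series([
    'extreme weight gain', 'detoxing now', 'cut my dosage',
    'felt dizzy after stopping', 'nausea and withdrawal', 'it worked well',
])
label_lists = pd.Series([
    ['ADR'], ['others'], ['others'], ['WD'], ['ADR', 'WD'], ['EF'],
])

# Binarize once on all labels so the column order is identical in every fold.
lb = MultiLabelBinarizer()
Y = lb.fit_transform(label_lists)

kf = KFold(n_splits=3)
fold_shapes = []
for train_index, test_index in kf.split(sentences):
    # Positional slicing of the Series and the label array with the same indices.
    X_train, X_test = sentences.iloc[train_index], sentences.iloc[test_index]
    Y_train, Y_test = Y[train_index], Y[test_index]
    # A pipeline would be fitted on (X_train, Y_train) and scored on (X_test, Y_test) here.
    fold_shapes.append((len(X_train), Y_train.shape[0], len(X_test), Y_test.shape[0]))

print(fold_shapes)
```

The point of recording the shapes is only to check that the number of training sentences equals the number of training label rows in each fold.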
Could you tell me what I am doing wrong?
My data looks like this:
,sentences,ADR,WD,EF,INF,SSI,DI,others
0,"extreme weight gain, short-term memory loss, hair loss.",1.0,,,,,,
1,I am detoxing from Lexapro now.,,,,,,,1.0
2,I slowly cut my dosage over several months and took vitamin supplements to help.,,,,,,,1.0
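To make the layout concrete, here is a small sketch that reads the three sample rows above and collects, for each sentence, the names of the label columns that are filled in. It uses only the standard `csv` module rather than the stack/groupby approach from the code earlier in the post, so treat it as an illustration of the data format, not as the original method. The names `sample`, `LABELS`, and `label_lists` are made up for this example.

```python
import csv
import io

# The three sample rows from the post, with the header line.
sample = '''\
,sentences,ADR,WD,EF,INF,SSI,DI,others
0,"extreme weight gain, short-term memory loss, hair loss.",1.0,,,,,,
1,I am detoxing from Lexapro now.,,,,,,,1.0
2,I slowly cut my dosage over several months and took vitamin supplements to help.,,,,,,,1.0
'''

LABELS = ['ADR', 'WD', 'EF', 'INF', 'SSI', 'DI', 'others']

sentences = []
label_lists = []
for row in csv.DictReader(io.StringIO(sample)):
    sentences.append(row['sentences'])
    # A label applies to the row when its cell is non-empty.
    label_lists.append([name for name in LABELS if row[name].strip()])

for text, labels in zip(sentences, label_lists):
    print('%s => %s' % (text, ', '.join(labels)))
```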