我在使用pandas加载数据集进行一些机器学习时遇到了一个罕见的错误。 这是我得到的错误:
我一直在阅读与之相关的内容,这似乎是由于列和大熊猫如何解释它们,但我不知道哪些是错的。 这是我正在使用的代码:
argsOK <- and <$> mapM check args
when (not argsOK) exitFailure
-- rest of your do-block
提前致谢
答案 0 :(得分:0)
在clf.fit
方法中,根据预期的文档参数为array
:
Parameters
----------
X : array-like, shape (n_samples, n_features)
Training data.
y : array, shape (n_samples,)
如果您查看link X
中的示例,y
numpy array
是as_matrix()
:
尝试对X
和y
使用X = pima[feature_cols]
,而不仅仅使用y = pima.label
和X = pima[feature_cols].as_matrix()
y = pima.label.as_matrix()
print
按import pandas as pd
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data'
col_names = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age', 'label']
pima = pd.read_csv(url, header=None, names=col_names)
# define X and y
feature_cols = ['pregnant', 'glucose', 'bp', 'skin', 'insulin', 'bmi', 'pedigree', 'age']
X = pima[feature_cols].as_matrix()
y = pima.label.as_matrix()
#k fold cv
from sklearn.model_selection import KFold, cross_val_score
kf = KFold(n_splits=10) #define number of splits
kf.get_n_splits(X) #to check how many splits will be done.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
clf = LinearDiscriminantAnalysis() #select the model for train, test in kf.split(X, y):
for train, test in kf.split(X, y):
y_pred_prob = clf.fit(X[train], y[train]).predict_proba(X[test])
y_pred_class = clf.predict(X[test])
print(y_pred_class)
进行测试:
[1 0 1 0 1 0 0 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 0 1 0 0 0 0 1
0 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0
0 0 0]
[0 1 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0
1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 1 1]
[1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0
0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 1 1 0 1 0 0 0 0 1 1 0 1 0 0 0 1
1 0 1]
[1 0 0 0 1 1 1 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 1 0 0 1 1
0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 1 0 0 0 0 0 1 0 1 1 0 0 0 0 1 0 0 1 0
0 1 0]
[0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 1 0 1 1 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0
0 0 0]
[0 0 0 1 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0
0 0 1 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 1
1 0 0]
[0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 0 0 1
1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
0 0 0]
[0 0 0 0 0 0 1 1 0 1 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0
0 0 0 1 1 0 1 0 0 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 1 1
0 1 0]
[0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0
0 0 1 0 0 1 0 1 1 1 1 0 0 1 0 0 1 1 0 0 1 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 1
0 1]
[0 1 0 0 1 0 0 1 0 0 1 1 0 0 0 0 1 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0 0 1 0 0 1 1 0 1 0 1 1 1 0 0 1 1 0 0 0 0 1 0 1 0 0 0 0
0 0]
结果:
Bool