I get this error when running my code under Python 2.6.6. It runs without problems under Python 3.4.3.
usr/lib64/python2.6/site-packages/sklearn/feature_selection/univariate_selection.py:319: UserWarning: Duplicate scores. Result may depend on feature ordering.There are probably duplicate features, or you used a classification score for a regression task.
warn("Duplicate scores. Result may depend on feature ordering."
Traceback (most recent call last):
  File "classification.py", line 31, in <module>
    main()
  File "classification.py", line 15, in main
    tm.optimaltrain(conf)
  File "/axp/gabm/npscnnct/dev/getThemes/textminer/textminer/classify.py", line 121, in optimaltrain
    score = self.cv(X,y,model)
  File "/axp/gabm/npscnnct/dev/getThemes/textminer/textminer/classify.py", line 140, in cv
    skf = cross_validation.StratifiedKFold(y, n_folds=self.cv_folds, shuffle=True)
TypeError: __init__() got an unexpected keyword argument 'shuffle'
Code:
def cv(self, X, y, model):
    y_true = []
    y_pred = []
    skf = cross_validation.StratifiedKFold(y, n_folds=self.cv_folds, shuffle=True)
    for train_index, test_index in skf:
        X_train, X_test = X[train_index], X[test_index]
        y_train, y_test = y[train_index], y[test_index]
        model.fit(X_train, y_train)
        y_pred += list(model.predict(X_test))
        y_true += list(y_test)
However, when I remove shuffle=True from the code, it runs fine. The modules I am using are scipy 0.11.0, nltk 2.0.1, and sklearn 0.14.1.
Please advise. Thanks.
Answer 0 (score: 2)
Here is the source of StratifiedKFold (version 0.14): https://github.com/scikit-learn/scikit-learn/blob/0.14.X/sklearn/cross_validation.py#L391
I have linked to the actual line of __init__ in sklearn, which shows that there is no shuffle keyword argument.
Upgrade to v0.15, which has shuffle (as shown here: https://github.com/scikit-learn/scikit-learn/blob/0.15.X/sklearn/cross_validation.py#L399).
I assume the version of sklearn on your Py3 installation is a newer one (0.15 or later)?
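To decide at runtime whether shuffle=True can be passed, one option is to compare the installed version against 0.15. The following is only a minimal sketch: parse_version and supports_shuffle are hypothetical helper names, not part of sklearn, and the 0.15 threshold is taken from the answer above.

```python
import re


def parse_version(version):
    """Turn a version string such as '0.14.1' into a tuple of ints, (0, 14, 1)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev'
        parts.append(int(match.group()))
    return tuple(parts)


def supports_shuffle(version):
    """True if the version is 0.15 or newer (threshold assumed from the answer)."""
    return parse_version(version) >= (0, 15)


# For the installed library one could call:
#   import sklearn
#   supports_shuffle(sklearn.__version__)
print(supports_shuffle("0.14.1"))
print(supports_shuffle("0.15.0"))
```

Under these assumptions, the cv method could pass shuffle=True only when supports_shuffle returns True, and otherwise omit the argument.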
Answer 1 (score: 2)
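In sklearn 0.14, cross_validation.StratifiedKFold() has no shuffle keyword argument. Apparently it was only added in a later version (0.15, in fact).
You can either update sklearn or shuffle the input yourself before stratifying (e.g., using random.shuffle()).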
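If you shuffle manually, X and y must be reordered with the same permutation. Calling random.shuffle() on each of them separately would break the pairing between samples and labels. A rough sketch using only the standard library follows; shuffled_indices and shuffle_together are hypothetical helper names, and the toy data is made up for illustration.

```python
import random


def shuffled_indices(n_samples, seed=None):
    """Return a random permutation of range(n_samples) as a list."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return indices


def shuffle_together(X, y, seed=None):
    """Reorder X and y with one shared permutation so rows and labels stay paired."""
    perm = shuffled_indices(len(y), seed)
    return [X[i] for i in perm], [y[i] for i in perm]


# Toy data: the first element of each row repeats its label,
# which makes it easy to check that the pairing survived the shuffle.
X = [[0, "a"], [1, "b"], [0, "c"], [1, "d"]]
y = [0, 1, 0, 1]
X_shuf, y_shuf = shuffle_together(X, y, seed=0)
```

With NumPy arrays, as the indexing in the question's cv method suggests, the same permutation could presumably be applied as X[perm] and y[perm]. The shuffled y would then go to StratifiedKFold(y, n_folds=...) without the shuffle argument.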