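I am trying to figure out the correct way to do 5-fold cross-validation in pybrain. I looked at their documentation, but it did not help. I found the following two versions of the code online:
The first one I found in the question here.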
net = pybrain.tools.shortcuts.buildNetwork(5, 8, 1)
trainer = BackpropTrainer(net, ds)
evaluation = ModuleValidator.classificationPerformance(trainer.module, ds)
validator = CrossValidator(trainer=trainer, dataset=trainer.ds, n_folds=5, valfunc=evaluation)
print(validator.validate())
Error:
evaluation = ModuleValidator.classificationPerformance(trainer.module, ds)
File ".../pybrain/tools/validation.py", line 168, in classificationPerformance
    dataset)
File ".../pybrain/tools/validation.py", line 204, in validate
    return valfunc(output, target)
File ".../pybrain/tools/validation.py", line 33, in classificationPerformance
    return float(n_correct) / float(len(output))
TypeError: only length-1 arrays can be converted to Python scalars
The second one is from here.
modval = ModuleValidator()
for i in range(1000):
    trainer.trainEpochs(1)
    trainer.trainOnDataset(dataset=trndata)
    cv = CrossValidator( trainer, trndata, n_folds=5, valfunc=modval.MSE )
    print "MSE %f @ %i" %( cv.validate(), i )
Error - trainer.train()
File ".../rprop.py", line 43, in train
    for seq in self.ds._provideSequences():
AttributeError: 'NoneType' object has no attribute '_provideSequences'
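I went through the source code to try to find the cause of the errors, but could not figure out what I need to change. Any help is appreciated.
When I run my code by simply splitting the dataset into three parts (training, validation, and test), it works fine. I only get these errors when I try to implement k-fold cross-validation.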
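For reference, the k-fold procedure being attempted here (as opposed to a single fixed train/validation/test split) can be sketched in plain Python without pybrain. This is only an illustration under stated assumptions: `k_fold_indices`, `cross_validate`, `fit_mean` and `mse` are hypothetical helper names, the "model" is a toy that predicts the mean training target, and none of this is pybrain's actual implementation.

```python
import random

def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Split range(n_samples) into n_folds disjoint, shuffled index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Strided slicing deals indices out like cards, so fold sizes differ by at most one.
    return [indices[i::n_folds] for i in range(n_folds)]

def cross_validate(xs, ys, fit, score, n_folds=5, seed=0):
    """Return one score per fold; each fold is held out exactly once."""
    folds = k_fold_indices(len(xs), n_folds, seed)
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the other folds only, then score on the held-out fold.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model,
                            [xs[i] for i in held_out],
                            [ys[i] for i in held_out]))
    return scores

# Toy "model" (hypothetical): always predict the mean of the training targets.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)

def mse(model, test_x, test_y):
    return sum((model - y) ** 2 for y in test_y) / len(test_y)

xs = list(range(20))
ys = [2.0 * x for x in xs]
fold_scores = cross_validate(xs, ys, fit_mean, mse, n_folds=5)
print(fold_scores)                          # one error value per fold
print(sum(fold_scores) / len(fold_scores))  # mean error across the folds
```

The point of the sketch is that every sample is used for testing exactly once and that a fresh model is fitted for each fold, which is what distinguishes this from the three-way split that already works.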
Answer 0 (score: 1)
This seems to work for me:
import numpy as np
from processdata import process_data
from pybrain.datasets import ClassificationDataSet
from pybrain.datasets import SupervisedDataSet
from pybrain.structure import FeedForwardNetwork
from pybrain.structure import LinearLayer, SigmoidLayer
from pybrain.structure import FullConnection
from pybrain.supervised.trainers import BackpropTrainer
n=FeedForwardNetwork()
#Define Layers
inLayer= LinearLayer(200)
hiddenLayer= SigmoidLayer(100)
outLayer = LinearLayer(1)
#Add layers to the neural net module
n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)
#Define Connections
in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)
#add connections to the module
n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)
#make ready
n.sortModules()
#Define Trainer (ds must already hold your dataset; building it is not shown here)
trainer = BackpropTrainer( n, dataset=ds, momentum=0.1, verbose=True, weightdecay=0.005)
#perform crossvalidation
from pybrain.tools.validation import CrossValidator
cv = CrossValidator(trainer=trainer, dataset=ds, n_folds=5) #creates a CrossValidator instance
print(cv.validate()) #runs the 5-fold cross-validation and prints the value it returns
It should output the error for each fold.
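One more observation about the first snippet in the question: `evaluation` there holds the value returned by calling `classificationPerformance`, not the function itself, so `valfunc=evaluation` would hand over a number where a parameter named `valfunc` presumably expects something callable. The second snippet, by contrast, passes `modval.MSE` without calling it. The difference can be sketched in plain Python; `accuracy` and `run_validation` below are hypothetical stand-ins for illustration, not pybrain functions, and this may not be the only cause of the traceback shown.

```python
def accuracy(outputs, targets):
    """Fraction of positions where the output equals the target."""
    correct = sum(1 for o, t in zip(outputs, targets) if o == t)
    return correct / float(len(targets))

def run_validation(valfunc, outputs, targets):
    """Hypothetical stand-in for a validator that calls valfunc itself."""
    return valfunc(outputs, targets)

outputs = [1, 0, 1, 1]
targets = [1, 0, 0, 1]

# Passing the function: the validator is free to call it whenever it needs to.
ok = run_validation(accuracy, outputs, targets)

# Passing a result: the validator receives a float and cannot call it.
already_computed = accuracy(outputs, targets)
try:
    run_validation(already_computed, outputs, targets)
    failed = False
except TypeError:
    failed = True

print(ok, failed)
```

Leaving `valfunc` out altogether, as in the code above, avoids the question because the validator then falls back to its default.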