Cross-validation in PyBrain

Date: 2015-03-11 14:43:24

Tags: python cross-validation pybrain

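I am trying to work out the correct way to do 5-fold cross-validation in PyBrain. I looked through the documentation, but it did not help. I found the following two versions of the code online:

The first one I found in the question here.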

net = pybrain.tools.shortcuts.buildNetwork(5, 8, 1)
trainer = BackpropTrainer(net, ds)
evaluation = ModuleValidator.classificationPerformance(trainer.module, ds)
validator = CrossValidator(trainer=trainer, dataset=trainer.ds, n_folds=5, valfunc=evaluation)
print(validator.validate())
  

Error:

  evaluation = ModuleValidator.classificationPerformance(trainer.module, ds)
File ".../pybrain/tools/validation.py", line 168, in classificationPerformance
  dataset)
File ".../pybrain/tools/validation.py", line 204, in validate
  return valfunc(output, target)
File ".../pybrain/tools/validation.py", line 33, in classificationPerformance
  return float(n_correct) / float(len(output))
TypeError: only length-1 arrays can be converted to Python scalars

The second one, from here:

  modval = ModuleValidator()
  for i in range(1000):
      trainer.trainEpochs(1)
      trainer.trainOnDataset(dataset=trndata)
      cv = CrossValidator( trainer, trndata, n_folds=5, valfunc=modval.MSE )
      print "MSE %f @ %i" %( cv.validate(), i )
  

Error:

  trainer.train()
File ".../rprop.py", line 43, in train
  for seq in self.ds._provideSequences():
AttributeError: 'NoneType' object has no attribute '_provideSequences'

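I went through the source code to find the cause of these errors, but could not figure out what I need to change. Any help is appreciated.

When I run my code by simply splitting the dataset into three parts (training, validation, and test), it works fine. I only get these errors when I try to implement k-fold cross-validation.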
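
For reference, the difference between a fixed three-way split and k-fold cross-validation can be sketched in plain Python, without PyBrain. In k-fold, the data is partitioned into k folds and each fold serves as the held-out set exactly once. The helper name `k_fold_indices` and its `seed` parameter are hypothetical, chosen for illustration only; this is a minimal sketch, not how PyBrain partitions data internally.

```python
import random


def k_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test_idx in enumerate(folds):
        # The training set is the union of all folds except fold i.
        train_idx = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train_idx, test_idx


# Usage: 10 samples split into 5 folds.
for train_idx, test_idx in k_fold_indices(10, n_folds=5):
    print(len(train_idx), len(test_idx))
```

A model would be trained on `train_idx` and scored on `test_idx` in each iteration, and the per-fold scores averaged at the end.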

1 Answer:

Answer 0 (score: 1)

This seems to work for me:

import numpy as np

from processdata import process_data
from pybrain.datasets import ClassificationDataSet
from pybrain.datasets import SupervisedDataSet
from pybrain.structure import FeedForwardNetwork
from pybrain.structure import LinearLayer, SigmoidLayer
from pybrain.structure import FullConnection
from pybrain.supervised.trainers import BackpropTrainer

n=FeedForwardNetwork()

#Define Layers
inLayer= LinearLayer(200)
hiddenLayer= SigmoidLayer(100)
outLayer = LinearLayer(1)

#Add layers to the neural net module
n.addInputModule(inLayer)
n.addModule(hiddenLayer)
n.addOutputModule(outLayer)

#Define Connections
in_to_hidden = FullConnection(inLayer, hiddenLayer)
hidden_to_out = FullConnection(hiddenLayer, outLayer)

#add connections to the module
n.addConnection(in_to_hidden)
n.addConnection(hidden_to_out)
#make ready
n.sortModules()

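#Define Trainer
#NOTE: ds is not defined in this snippet; it must be an existing dataset with 200 inputs and 1 target
trainer = BackpropTrainer( n, dataset=ds, momentum=0.1, verbose=True, weightdecay=0.005)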

#perform crossvalidation
from pybrain.tools.validation import CrossValidator #the package name is all lowercase
cv=CrossValidator(trainer=trainer, dataset=ds, n_folds=5) #creates a crossvalidator instance
cv.validate() #calls the validate() function in CrossValidator to return results

It should print the error for each fold.
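
One difference between the two snippets in the question is worth noting: the second passes the function `modval.MSE` itself as `valfunc`, while the first passes the result of calling `classificationPerformance`. The sketch below illustrates, in plain Python and without PyBrain, a cross-validation loop that receives its scoring function as a callable and invokes it once per fold. All names (`cross_validate`, `fit_mean`, `mean_squared_error`) are hypothetical, and this is not PyBrain's implementation.

```python
def mean_squared_error(outputs, targets):
    """Hypothetical scoring function: mean of the squared differences."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / float(len(targets))


def cross_validate(fit, score, inputs, targets, n_folds=5):
    """Average `score` over n_folds held-out folds.

    `fit(train_inputs, train_targets)` must return a predict(x) function.
    `score` is passed in as a function and is only called here, once per fold.
    """
    n = len(inputs)
    fold_scores = []
    for i in range(n_folds):
        # Fold i holds out every n_folds-th sample starting at index i.
        test_idx = list(range(i, n, n_folds))
        held_out = set(test_idx)
        train_idx = [j for j in range(n) if j not in held_out]
        predict = fit([inputs[j] for j in train_idx],
                      [targets[j] for j in train_idx])
        outputs = [predict(inputs[j]) for j in test_idx]
        fold_scores.append(score(outputs, [targets[j] for j in test_idx]))
    return sum(fold_scores) / n_folds


def fit_mean(train_inputs, train_targets):
    """Toy model: always predict the mean of the training targets."""
    mean = sum(train_targets) / float(len(train_targets))
    return lambda x: mean


xs = list(range(10))
ys = [2.0] * 10
# Pass the function object, not the result of calling it.
print(cross_validate(fit_mean, mean_squared_error, xs, ys, n_folds=5))
```

Passing `mean_squared_error(...)` with arguments instead of the bare name would hand the loop a number rather than something it can call on each fold.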