IndexError when fitting an SSVM model in PyStruct

Date: 2015-02-21 21:52:40

Tags: python numpy machine-learning

I'm using the pystruct Python module for a structured learning problem, classifying posts in discussion threads, and I've run into a problem while trying to train a OneSlackSSVM for use with a LinearChainCRF. I'm following the OCR example from the docs, but I can't seem to call the .fit() method on the SSVM. Here is the error I get:

    Traceback (most recent call last):
      File "<ipython-input-47-da804d135818>", line 1, in <module>
        ssvm.fit(X_train, y_train)
      File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/learners/one_slack_ssvm.py", line 429, in fit
        joint_feature_gt = self.model.batch_joint_feature(X, Y)
      File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/models/base.py", line 40, in batch_joint_feature
        joint_feature_ += self.joint_feature(x, y)
      File "/Users/kylefth/anaconda/lib/python2.7/site-packages/pystruct/models/graph_crf.py", line 197, in joint_feature
        unary_marginals[gx, y] = 1
    IndexError: index 7 is out of bounds for axis 1 with size 7

Below is the code I've written. I've tried to structure the data as in the docs example, where the overall data structure is a dict with keys for data, labels, and folds.

    from pystruct.models import LinearChainCRF
    from pystruct.learners import OneSlackSSVM

    # Printing out keys of overall data structure
    print threads.keys()
    >>> ['folds', 'labels', 'data']

    # Creating instances of models
    crf = LinearChainCRF()
    ssvm = OneSlackSSVM(model=crf)

    # Splitting up data into training and test sets as in example
    X, y, folds = threads['data'], threads['labels'], threads['folds']
    X_train, X_test = X[folds == 1], X[folds != 1]
    y_train, y_test = y[folds == 1], y[folds != 1]

    # Print out dimensions of first element in data and labels
    print X[0].shape, y[0].shape
    >>> (8, 211), (8,)

    # Fitting the ssvm model
    ssvm.fit(X_train, y_train)
    >>> see error above

After trying to fit the model, I get the error above. All instances in X_train, X_test, and y_train have 211 columns, and all label dimensions appear to match their corresponding training and test data. Any help would be greatly appreciated.

1 answer:

Answer 0 (score: 2)

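I think everything you are doing is right; this is https://github.com/pystruct/pystruct/issues/114. Your labels need to start at 0 and run up to n_labels - 1. I think yours start at 1, which would explain why index 7 is out of bounds for an axis of size 7.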
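One way to apply this is to remap the labels to a contiguous 0-based range before calling .fit(). The snippet below is only a minimal sketch, not part of pystruct: the helper name remap_labels_to_zero_based and the toy y_example data are hypothetical, and it assumes the labels are integers stored as one sequence per thread.

```python
def remap_labels_to_zero_based(label_sequences):
    """Map integer labels onto 0 .. n_labels - 1 (hypothetical helper).

    label_sequences: an iterable of per-thread label sequences
    (lists or 1-D arrays of integers).
    Returns the remapped sequences as lists, plus the mapping used.
    """
    # Collect every distinct label that appears in any sequence.
    unique_labels = sorted(
        {int(label) for seq in label_sequences for label in seq}
    )
    # Assign each original label its rank, so the smallest becomes 0.
    mapping = {label: index for index, label in enumerate(unique_labels)}
    remapped = [
        [mapping[int(label)] for label in seq] for seq in label_sequences
    ]
    return remapped, mapping


# Toy data (assumed): 7 distinct labels that start at 1, as in the question.
y_example = [[1, 2, 7], [3, 4, 5, 6]]
y_zero_based, mapping = remap_labels_to_zero_based(y_example)

print(y_zero_based)  # [[0, 1, 6], [2, 3, 4, 5]]
print(max(max(seq) for seq in y_zero_based))  # 6, so it fits an axis of size 7
```

The helper returns plain lists, so each sequence would still need to be converted back to an integer NumPy array before training. Keeping the returned mapping makes it possible to translate predictions back to the original label values afterwards.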