SVM validation: ValueError: The number of classes has to be greater than one; got 1 class

Date: 2018-12-15 15:57:30

Tags: python-3.x validation numpy svm orange

I am trying to build a new validation algorithm to evaluate classifier performance, but for a week now I have been stuck against a wall I cannot climb on my own. Whenever I try to evaluate two-class labeled data like this

> ([[0.517, 0.118 | C2], [0.422, 0.745 | C2], [0.504, 0.422 | C2],
> [0.377, 0.520 | C2], [0.286, 0.636 | C1], [0.500, 0.066 | C2], [0.437,
> 0.639 | C2], [0.478, 0.138 | C2], [0.299, 0.140 | C1], [0.293, 0.275 | C1], [0.470, 0.246 | C2], [0.335, 0.405 | C1], [0.229, 0.563 | C1],
> [0.347, 0.295 | C1], [0.470, 0.176 | C2], [0.309, 0.069 | C1]])

the SVM fails with this error: ValueError: The number of classes has to be greater than one; got 1 class.
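For context, this message is raised by scikit-learn's `SVC` whenever the labels passed to `fit` contain only a single class. A minimal snippet (my own reproduction, not part of the question's code) that triggers exactly this error:

```python
import numpy as np
from sklearn.svm import SVC

# Three training points that all carry the same label: SVC cannot fit this,
# because a two-class SVM needs at least one example of each class.
X = np.array([[0.517, 0.118], [0.422, 0.745], [0.504, 0.422]])
y = np.array(["C2", "C2", "C2"])

try:
    SVC().fit(X, y)
except ValueError as err:
    print(err)  # The number of classes has to be greater than one; got 1 class
```

So the error means the *training* split handed to the learner contained only C1s or only C2s at fit time, regardless of what the full data set looks like.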

Since it does compute when given enough data (around 100 data points or more), my professor suggested I sort the data in alternating order (C1, C2, C1, C2, ...). But I think the problem is that one of the labels gets lost. I have printed np.unique of the Y labels for both self.data (the training set) and X (the test set), and they all look pretty much the same (just with more distinct float values):

E.g. for X:
[0.         0.17289992 0.20676427 0.24686935 0.47190027 0.48798112
 0.62406178 0.63207783 0.65526534 1.        ]

I really don't know what to do. Please forgive my poor coding skills; I am a psychologist, not a programmer.
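One plausible cause, given the split in the code below: with `test_size=0.8`, only 3 of the 16 points land in the training split, and a plain random shuffle can easily put three same-labeled points there. A hedged sketch of the usual workaround, passing `stratify=y` to `train_test_split` so every split keeps both classes (this only sketches the split, not the whole class from the question):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# The 16 labeled points from the data above.
X = np.array([[0.517, 0.118], [0.422, 0.745], [0.504, 0.422], [0.377, 0.520],
              [0.286, 0.636], [0.500, 0.066], [0.437, 0.639], [0.478, 0.138],
              [0.299, 0.140], [0.293, 0.275], [0.470, 0.246], [0.335, 0.405],
              [0.229, 0.563], [0.347, 0.295], [0.470, 0.176], [0.309, 0.069]])
y = np.array(["C2", "C2", "C2", "C2", "C1", "C2", "C2", "C2",
              "C1", "C1", "C2", "C1", "C1", "C1", "C2", "C1"])

# With test_size=0.8 only 3 of 16 points go to the training split;
# stratify=y preserves the class ratio, so both C1 and C2 are guaranteed there.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.8, random_state=42, stratify=y)

print(np.unique(y_train))  # ['C1' 'C2']
```

This would also explain why the error disappears with ~100 points: the larger the training split, the less likely a random shuffle leaves it single-class, and alternating C1/C2 (the professor's suggestion) reduces the odds in the same way without removing the underlying problem.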

Here is my code:

    class IndependentValidation(Results):
        """
        Independent Validation Testing
        split data into test_data, train_data.
        split test_data in n_splits = number of observations
        ---Loop---
        for each n_split_test_data do:
            train on train_data
            predict and test on 1 observation = (1_split_test_data)
            add used 1_split_test_data to train_data
            (re)move used 1_split_test_data from test_data
        do until test_data=0 / while test_data!=0
        Structure e.g = Leave One Out
        """
        score_by_folds = False

        def __init__(self, data, learners, store_data=False, store_models=False, preprocessor=None,
                     callback=None, n_jobs=1, train_size=None, test_size=0.8, random_state=42):

            self.train_size = train_size
            self.test_size = test_size
            self.random_state = random_state

            super().__init__(data, learners=learners, store_data=store_data, store_models=store_models,
                             preprocessor=preprocessor, callback=callback, n_jobs=n_jobs)

        def setup_indices(self, train_data, test_data):
            X, Y = skl.train_test_split(self.data, test_size=0.8, random_state=42, shuffle=True)
            train_data = np.array(range(_num_samples(X)))
            test_data = np.array(range(len(train_data), _num_samples(Y)))
            StratArr = []
            lngth = len(test_data)
            while lngth != 1:
                x, test_data = test_data[0], test_data[::-1]
                y, test_data = test_data[0], test_data[:-1]
                z, test_data = test_data[0], test_data[::-1]
                x = np.array([x])
                StratArr.insert(len(StratArr), ((train_data), (x)))
                train_data = np.concatenate((train_data, x), axis=0)
                lngth = lngth - 1
            StratArr.insert(len(StratArr), ((train_data), (test_data)))
            self.indices = StratArr

        def prepare_arrays(self, StratArr):
            # sped up version of super().prepare_arrays(data)
            self.row_indices = np.arange(len(StratArr))
            self.folds = self.row_indices
            self.actual = StratArr.Y.flatten()
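For comparison, the scheme the docstring describes (train, predict a single held-out observation, fold it back into the training data, repeat) can be sketched with plain scikit-learn, outside the Orange `Results` machinery. The name `evaluate_incrementally` is my own, not an Orange or scikit-learn API; `stratify=y` is added here on the assumption that the single-class training split is the culprit:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def evaluate_incrementally(X, y, test_size=0.8, random_state=42):
    """Train, predict one test point at a time, then absorb it into the train set."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y)
    correct = 0
    for i in range(len(X_te)):
        clf = SVC().fit(X_tr, y_tr)               # retrain on the growing train set
        correct += clf.predict(X_te[i:i + 1])[0] == y_te[i]
        X_tr = np.vstack([X_tr, X_te[i:i + 1]])   # add the used observation ...
        y_tr = np.append(y_tr, y_te[i])           # ... and its label to the train set
    return correct / len(X_te)
```

On the 16 points quoted above this runs without the ValueError, because the stratified split guarantees both labels are present in the initial training set before the first `fit`.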

0 Answers:

There are no answers