scikit-learn:预期的2D数组,取而代之的是1D数组

时间:2019-01-11 09:40:55

标签: python machine-learning scikit-learn

我目前正在尝试训练我的数据集,并且目前正在尝试按照链接中的步骤进行操作。 Link here

我不断收到类似的错误

ValueError: Expected 2D array, got 1D array instead: array=[0. 0. 0. and so on..]. 

这是我正在尝试使用sci-kit学习测试和训练的代码:

import numpy as np
from sklearn.svm import LinearSVC
import os
import cv2
import joblib

# Generate training set
TRAIN_PATH = 'Try/'
list_folder = os.listdir(TRAIN_PATH)
trainset = []
for folder in list_folder:
    flist = os.listdir(os.path.join(TRAIN_PATH, folder))
    for f in flist:
        im = cv2.imread(os.path.join(TRAIN_PATH, folder, f))
        im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY )
        im = cv2.resize(im, (36,36))
        trainset.append(im)
# Labeling for trainset
train_label = []
for i in range(0,1): #in the dataset i currently have a 0 folder and a 1 folder
    temp = 400*[i] #400 images in 1 folder
    train_label += temp

# Generate testing set
TEST_PATH = 'Test/'
list_folder = os.listdir(TEST_PATH)
testset = []
test_label = []
for folder in list_folder:
    flist = os.listdir(os.path.join(TEST_PATH, folder))
    for f in flist:
        im = cv2.imread(os.path.join(TEST_PATH, folder, f))
        im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY )
        im = cv2.resize(im, (36,36))
        testset.append(im)
        test_label.append(folder)
trainset = np.reshape(trainset, (800, -1)) #800 because total number of images

# Create an linear SVM object
clf = LinearSVC()

# Perform the training
clf.fit(train_label, testset)
print("Training finished successfully")

# Testing
testset = np.reshape(testset, (len(testset), -1))
y = clf.predict(testset)
print("Testing accuracy: " + str(clf.score(testset, test_label)))

这是错误消息:

 Traceback (most recent call last):   File
 "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\Thesis\Test.py",
 line 43, in <module>
     clf.fit(train_label, testset)   File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\svm\classes.py",
 line 229, in fit
     accept_large_sparse=False)   File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\validation.py",
 line 756, in check_X_y
     estimator=estimator)   File "C:\Users\hp\AppData\Local\Programs\Python\Python37-32\lib\site-packages\sklearn\utils\validation.py",
 line 552, in check_array
     "if it contains a single sample.".format(array)) ValueError: Expected 2D array, got 1D array instead: array=[0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]. Reshape your data either using array.reshape(-1, 1) if your data has a single feature or
 array.reshape(1, -1) if it contains a single sample.

1 个答案:

答案 0 :(得分:2)

很难说出这么少的信息,但可能与clf.fit()正在接收的参数有关。您似乎在实际训练数据train_label之前将标签X输入了标签。

如果您查看文档,则顺序如下:

  

fit(X,y [,sample_weight])根据给定的训练数据拟合SVM模型。

改为使用:

clf.fit(testset, train_label)