How to correctly build a feature matrix

Time: 2019-06-28 07:07:37

Tags: python python-3.x machine-learning scikit-learn

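I am trying to build a feature matrix and a label vector, but I keep running into various errors.

I looked here: Build the feature matrix and label vector

But my example still fails with an error.

I have 9000 images (900 images per label, stored in separate subfolders), and I am trying to build the feature matrix and label vector as follows: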

import glob

import cv2
import numpy as np

TRAIN_IMAGES_DIR = "C:/MachineLearning/ImageTraining/idenprof-jpg/idenprof/train/"
NUM_OF_LABELS = 10
NUM_OF_FILES_PER_LABEL = 900

# get image size
IMAGE_LEN = 0
for file in glob.glob(TRAIN_IMAGES_DIR + "**/*.jpg", recursive=True):   
    image = cv2.imread(file) 
    IMAGE_LEN = len(image)  
    break


train_labels = np.zeros(NUM_OF_FILES_PER_LABEL*NUM_OF_LABELS)
features_matrix = np.zeros((NUM_OF_FILES_PER_LABEL,IMAGE_LEN))

fileNum = 0
label = 0

for file in glob.glob(TRAIN_IMAGES_DIR + "**/*.jpg", recursive=True):   
    image = cv2.imread(file) 
    n_samples = len(image)  
    feature = image.reshape((n_samples, -1))    

    train_labels[fileNum] = label
    features_matrix[fileNum] = np.copy(feature) 

    fileNum += 1
    if fileNum % NUM_OF_FILES_PER_LABEL == 0:
        label += 1

But I get this error:

Traceback (most recent call last):
  File "myImageClassifier.py", line 40, in <module>
    features_matrix[fileNum] = np.copy(feature)
ValueError: could not broadcast input array from shape (224,672) into shape (224)
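
The shape mismatch in the traceback can be reproduced with NumPy alone. This is a minimal sketch, assuming every image is 224x224 with 3 colour channels (inferred from the (224, 672) in the error, since 672 = 224 * 3); the zero-filled array is only a stand-in for what cv2.imread returns. It suggests that len(image) measures just the image height, so each row of features_matrix is too short to hold a whole image:

```python
import numpy as np

# Stand-in for cv2.imread(file): height x width x channels (assumed 224x224x3).
image = np.zeros((224, 224, 3), dtype=np.uint8)

# len() of an ndarray is the size of its first axis only, i.e. the image height.
image_len = len(image)

# A matrix allocated with that width has rows of only 224 values.
features_matrix = np.zeros((900, image_len))
row_shape = features_matrix[0].shape

# reshape((n_samples, -1)) keeps the 224 rows and merges width with channels.
feature = image.reshape((len(image), -1))

# Flattening to 1-D instead gives one value per pixel channel.
flat = image.reshape(-1)

print(image_len, row_shape, feature.shape, flat.shape)
# prints: 224 (224,) (224, 672) (150528,)
```

Under that assumption, each row would need 224 * 224 * 3 = 150528 columns, and the matrix would need one row per image (9000) rather than one per image of a single label (900).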

I have tried several different ways of building the feature matrix, without success.

What should I do?

(I am using Python 3.7 and sklearn.)
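
One possible way to build both arrays is sketched below, with no claim that it is the only or best approach. It flattens each image into a single row and takes the label from the subfolder the file sits in, instead of counting 900 files per label. The names build_dataset and read_image are hypothetical, and the sketch assumes one subfolder per label directly under the training directory and that all images have the same size:

```python
import glob
import os

import numpy as np


def build_dataset(train_dir, read_image):
    """Return (features_matrix, train_labels) for train_dir/<label>/*.jpg.

    read_image is any callable that maps a file path to an ndarray,
    for example cv2.imread (hypothetical parameter, chosen for this sketch).
    """
    # One subfolder per label; sorting keeps the label numbering reproducible.
    label_dirs = sorted(
        d for d in glob.glob(os.path.join(train_dir, "*")) if os.path.isdir(d)
    )
    features, labels = [], []
    for label, label_dir in enumerate(label_dirs):
        for path in sorted(glob.glob(os.path.join(label_dir, "*.jpg"))):
            image = read_image(path)
            if image is None:  # cv2.imread returns None for unreadable files
                continue
            features.append(image.reshape(-1))  # flatten H x W x C to 1-D
            labels.append(label)
    # np.stack raises ValueError if the flattened images differ in length.
    return np.stack(features), np.array(labels)
```

With OpenCV installed, the call would look like features_matrix, train_labels = build_dataset(TRAIN_IMAGES_DIR, cv2.imread). The result has shape (n_images, height * width * channels), which is the (n_samples, n_features) layout scikit-learn estimators expect. If the images really are 224x224x3, 9000 of them take about 1.35 GB as uint8, so converting the whole matrix to float64 may not fit in memory.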

0 answers:

No answers yet