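I am trying to train a CNN on a training dataset of 30,000 images. I believe I know what the error is: my laptop does not have enough memory to handle all the data. But what is the solution? I checked that the data type is uint8, yet the error refers to float64.
Preprocessing code: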
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
CATEGORIES = []
number = 1
# Read one class name per line; the file is closed automatically
with open("Class List", "r") as class_list:
    for line in class_list:
        line = line.strip()
        CATEGORIES.append(line)
DATADIR = r'C:\Users\steel\Downloads\Datasets\225 Bird Species\images'
for category in CATEGORIES:
    path = os.path.join(DATADIR, category)
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_COLOR)
DATADIR = r'C:\Users\steel\Downloads\Datasets\225 Bird Species\train'
training_data = []
def format_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)  # integer label for this class
        for img in os.listdir(path):
            # cv2.imread returns None (it does not raise) when a file cannot be read
            img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_COLOR)
            if img_array is None:
                continue
            training_data.append([img_array, class_num])
format_training_data()
print(len(training_data))
import random
random.shuffle(training_data)
X = []
y = []
for features, label in training_data:
    X.append(features)
    y.append(label)
X = np.array(X).reshape(-1, 224, 224, 3)
y = np.array(y)
import pickle
with open("X_SciFair.pickle", "wb") as pickle_out:
    pickle.dump(X, pickle_out)
with open("y_SciFair.pickle", "wb") as pickle_out:
    pickle.dump(y, pickle_out)
Where the error occurs:
X = pickle.load(open('X_SciFair.pickle', 'rb'))
y = pickle.load(open('y_SciFair.pickle', 'rb'))
X = X / 255.0  # <-- the error is raised on this line
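A small sketch of why the error can mention float64 even though X is uint8: dividing a uint8 array by the Python float 255.0 makes NumPy allocate a brand-new float64 result, which takes 8 times the memory of the original. The array `X_small` below is a hypothetical stand-in for the real X, not the actual data.

```python
import numpy as np

# Tiny stand-in for X: 2 images of 224x224x3, dtype uint8 (hypothetical data)
X_small = np.zeros((2, 224, 224, 3), dtype=np.uint8)

# Dividing by a Python float allocates a new float64 array
as_float64 = X_small / 255.0
print(as_float64.dtype, as_float64.nbytes // X_small.nbytes)  # float64 8

# Converting explicitly to float32, then dividing in place, stays float32
as_float32 = X_small.astype(np.float32)
as_float32 /= 255.0
print(as_float32.dtype, as_float32.nbytes // X_small.nbytes)  # float32 4
```

The float32 route halves the size of the result compared with float64, but it is still 4 times the uint8 array, so on its own it may not be enough for the full dataset.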
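My laptop has 8 GB of RAM, but usually only about 5 GB is free at any given moment, and I'm pretty sure I can't add more. If you need more information to answer, I will try to add it.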
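One possible direction under that memory limit, sketched here under assumptions: keep the images as uint8 on disk and convert only one batch at a time to float32. This sketch swaps pickle for `np.save` so the file can be opened with `np.load(..., mmap_mode="r")`. The function name `normalized_batches`, the file name `X_demo.npy`, and the batch size are hypothetical and not part of the code above.

```python
import os
import tempfile
import numpy as np

def normalized_batches(npy_path, batch_size=32):
    """Yield float32 batches scaled to [0, 1] from a uint8 .npy file."""
    # mmap_mode="r" maps the file instead of reading it all into RAM
    data = np.load(npy_path, mmap_mode="r")
    for start in range(0, data.shape[0], batch_size):
        batch = data[start:start + batch_size]
        # Only this slice is converted, so only one batch is float32 at a time
        yield np.asarray(batch, dtype=np.float32) / 255.0

# Usage on a small hypothetical array standing in for X
demo_path = os.path.join(tempfile.mkdtemp(), "X_demo.npy")
np.save(demo_path, np.full((10, 224, 224, 3), 255, dtype=np.uint8))
batches = list(normalized_batches(demo_path, batch_size=4))
print(len(batches), batches[0].dtype, batches[0].shape)
```

In real training the batches would be consumed one by one rather than collected into a list; the list here only serves to inspect the result on a tiny array.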
P.S. I tried changing the array dtype to 'float16', but it was changed to float32, and the array is still too large (12.3 GiB).
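For a rough sense of scale, the size of the array for each dtype can be estimated with simple arithmetic. This is a sketch that assumes exactly 30,000 images of shape 224x224x3; the helper `array_gib` is hypothetical.

```python
import numpy as np

def array_gib(n_images, dtype, shape=(224, 224, 3)):
    """Return the size in GiB of n_images arrays of the given shape and dtype."""
    n_bytes = n_images * int(np.prod(shape)) * np.dtype(dtype).itemsize
    return n_bytes / 2**30

for dtype in (np.uint8, np.float16, np.float32, np.float64):
    print(np.dtype(dtype).name, round(array_gib(30_000, dtype), 1), "GiB")
```

Under that assumption the uint8 array is about 4.2 GiB and a float32 copy about 16.8 GiB. The figure reported in the error depends on how many images were actually loaded into X, which `print(len(training_data))` shows.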