我是机器学习的新手,并且正在使用一个小的数据集来训练CNN。我已经尝试过数据增强,但是不会超过75%的准确性。我想知道k折交叉验证是否可能是下一步行动。但是我不知道如何为它编写代码。
到目前为止,这是我的代码:
数据目录:
train_dir = '../input/train_images'
train_labels = pd.read_csv('../input/train.csv')
train_labels['diagnosis'] = train_labels['diagnosis'].astype(str)
train_labels["id_code"]=train_labels["id_code"].apply(lambda x:x+".png")
test_dir = '../input/test_images'
test_labels = '../input/test.csv'
预处理:
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255,)
train_generator = train_datagen.flow_from_dataframe(
train_labels,
directory= train_dir,
x_col='id_code', y_col='diagnosis',
target_size=(150, 150),
color_mode='grayscale',
class_mode='categorical',
batch_size=32,
shuffle=True,)
模型:
def get_model():
model = models.Sequential()
model.add(layers.Conv2D(32, (3,3), activation='relu', input_shape=(150,150,3)))
model.add(layers.MaxPooling2D(2,2))
model.add(layers.Conv2D(64, (3,3), activation='relu'))
model.add(layers.MaxPooling2D(2,2))
model.add(layers.Conv2D(128, (3,3), activation='relu'))
model.add(layers.Conv2D(128, (3,3), activation='relu'))
model.add(layers.MaxPooling2D(2,2))
model.add(layers.Conv2D(128, (3,3), activation='relu'))
model.add(layers.Conv2D(128, (3,3), activation='relu'))
model.add(layers.MaxPooling2D(2,2))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(5, activation='softmax'))
#Compile your model
model.compile(loss='categorical_crossentropy',
optimizer=optimizers.Adam(),
metrics=['acc'])
return model
可以在以下链接中找到数据集: https://www.kaggle.com/c/aptos2019-blindness-detection/data