Keras ImageDataGenerator flow_from_dataframe() finds 0 images

Time: 2020-05-14 15:49:19

Tags: python tensorflow keras

I am using jupyter-notebook with tensorflow-gpu 2.1.

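When I try to feed the data into my model, it gives me an error. Some images belong to two or more different classes. Since this is a multi-label task, I need to define the classes myself.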
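To make the multi-label setup concrete, here is a minimal sketch of the kind of target vector a multi-label image needs: one slot per class, with a 1 for every class present. The helper `multi_hot` and the three-class list are hypothetical and only illustrate the idea; they are not part of the notebook.

```python
def multi_hot(labels, classes):
    """Return a 0/1 vector with a 1 at the position of every class in `labels`."""
    present = set(labels)
    return [1 if cls in present else 0 for cls in classes]


# Hypothetical three-class subset, for illustration only.
example_classes = ["Edema", "Effusion", "Mass"]

# An image tagged with two findings gets two 1s in its target vector.
print(multi_hot(["Edema", "Mass"], example_classes))  # [1, 0, 1]
```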

The paths I am using:

test_size=0.2
img_rows, img_cols, channels = 224,224,3 
batch_size = 1

#Specify paths
df_path = '/home/Erdal.Genc/covid_work/dset_preprocessing/NIH/NIH_df_ed.csv'
img_path = '/mnt/dsets/ChestXrays/NIH/images'
outputfolder = '/home/Erdal.Genc/covid_work/image_analysis'

Splitting into training and test sets:

import pandas as pd
from sklearn.model_selection import train_test_split

#import dataframe
NIH_df = pd.read_csv(df_path, low_memory=False, dtype=str)

#Split into test and training data
train_df, test_df = train_test_split(NIH_df, test_size=test_size)
print("Train and test data>>", len(train_df),len(test_df))

class_list=["No Finding", "Edema", "Atelectasis", "Consolidation", "Infiltration", "Effusion", 
            "Hernia", "Pneumothorax", "Pneumonia", "Mass", "Nodule", "Emphysema", 
            "Pleural_Thickening", "Cardiomegaly", "Fibrosis"]

Train and test data>> 89696 22424

The ImageDataGenerator is called like this:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

#Create training array
#On the fly with keras flow_from_dataframe
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator=train_datagen.flow_from_dataframe(
    dataframe=train_df,
    directory='/mnt/dsets/ChestXrays/NIH/images',
    x_col="Image Index",
    y_col='labels',
    has_ext=True,
    batch_size=batch_size,
    seed=42,
    class_mode='categorical',
    classes=class_list,
    #color_mode = 'grayscale',
    target_size=(img_rows, img_cols))

valid_generator=test_datagen.flow_from_dataframe(
    dataframe=test_df,
    directory=img_path,
    x_col="Image Index",
    y_col='labels',
    #subset="validation",
    batch_size=1,#batch_size,
    seed=42,
    class_mode='categorical',
    classes=class_list,
    #color_mode = 'grayscale',
    target_size=(img_rows, img_cols))

STEP_SIZE_TRAIN=train_generator.n//train_generator.batch_size
STEP_SIZE_VALID=valid_generator.n//valid_generator.batch_size

Result:

Found 0 validated image filenames belonging to 15 classes.
Found 0 validated image filenames belonging to 15 classes.
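A quick check that may help narrow this down: as far as I understand `flow_from_dataframe`, when `classes` is passed, rows whose `y_col` value is not exactly one of the listed names are dropped before the count is printed. The sketch below counts how many raw label values match a class name exactly. The helper `summarize_label_matches` and the sample values are hypothetical; in the notebook the input would be `train_df["labels"]`.

```python
from collections import Counter


def summarize_label_matches(raw_labels, classes):
    """Count raw label values that are exactly equal to a class name."""
    known = set(classes)
    matched = sum(1 for value in raw_labels if value in known)
    # Tally the values that would not match any class name as-is.
    unmatched = Counter(value for value in raw_labels if value not in known)
    return matched, unmatched


# Hypothetical values standing in for the 'labels' column.
sample = ["Edema|Effusion", "No Finding", "Edema|Effusion"]
matched, unmatched = summarize_label_matches(
    sample, ["No Finding", "Edema", "Effusion"]
)
print(matched)                   # 1
print(unmatched.most_common(1))  # [('Edema|Effusion', 2)]
```

If `matched` comes out as 0 on the real column, the stored label strings do not match the class names as written, which would be consistent with 0 images being found.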

When I comment out "classes=class_list" like this:

train_generator=train_datagen.flow_from_dataframe(
    dataframe=train_df,
    directory='/mnt/dsets/ChestXrays/NIH/images', 
    x_col="Image Index",
    y_col='labels',
    has_ext=True,
    batch_size=batch_size,
    seed=42,
    class_mode='categorical',
    #classes=class_list,
    #color_mode = 'grayscale',
    target_size=(img_rows, img_cols))

valid_generator=test_datagen.flow_from_dataframe(
    dataframe=test_df,
    directory=img_path,
    x_col="Image Index",
    y_col='labels',
    #subset="validation",
    batch_size=1,#batch_size,
    seed=42,
    class_mode='categorical',
    #classes=class_list,
    #color_mode = 'grayscale',
    target_size=(img_rows, img_cols))

it finds the images, but not the correct classes:

Found 89695 validated image filenames belonging to 769 classes.
Found 22424 validated image filenames belonging to 426 classes.
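Hundreds of classes instead of 15 would be consistent with each distinct label combination being read as a single class name. The Keras documentation states that, with `class_mode='categorical'`, a multi-label `y_col` value should be a list or tuple of class names. Below is a minimal sketch of converting raw strings into lists. It assumes the `labels` column holds either separator-joined names or stringified Python lists; the actual format of the CSV is not shown here, and `parse_labels` is a hypothetical helper.

```python
import ast


def parse_labels(raw, sep="|"):
    """Turn one raw label string into a list of class names."""
    text = raw.strip()
    if text.startswith("["):
        # Assumed format: a stringified Python list such as "['Edema', 'Mass']".
        return [str(item) for item in ast.literal_eval(text)]
    # Assumed format: names joined by a separator such as "Edema|Mass".
    return [part.strip() for part in text.split(sep) if part.strip()]


print(parse_labels("Edema|Effusion"))         # ['Edema', 'Effusion']
print(parse_labels("['Edema', 'Effusion']"))  # ['Edema', 'Effusion']
print(parse_labels("No Finding"))             # ['No Finding']

# In the notebook (not run here), before calling flow_from_dataframe:
# train_df["labels"] = train_df["labels"].apply(parse_labels)
# test_df["labels"] = test_df["labels"].apply(parse_labels)
```

With list-valued labels, passing `classes=class_list` should then keep any row that has at least one listed class, although this has not been verified against the dataset in question.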

Is there any way to fix this?

Thanks!

0 answers:

No answers yet