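I have an image dataset whose file names are stored in the first column of a CSV file and whose target labels are stored in the second column. I am trying to generate training and validation data with the flow_from_dataframe method. This is the data generation part of my code: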
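# Import path assumed; the original post does not show its imports
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rescale pixel values to [0, 1] and reserve 20% of the rows for validation
datagen = ImageDataGenerator(rescale=1./255., validation_split=0.2)

# Creating the training generator (train_data is the DataFrame loaded from the CSV file)
train_generator = datagen.flow_from_dataframe(
    dataframe=train_data,
    directory="Images/",
    x_col="UID",
    y_col="growth_stage",
    subset="training",
    batch_size=100,
    seed=1,
    shuffle=True,
    class_mode="sparse",
    target_size=(100, 100))

# Creating the validation generator from the same DataFrame
val_generator = datagen.flow_from_dataframe(
    dataframe=train_data,
    directory="Images/",
    x_col="UID",
    y_col="growth_stage",
    subset="validation",
    batch_size=100,
    seed=1,
    shuffle=True,
    class_mode="sparse",
    target_size=(100, 100))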
This is the output:
Found 4888 validated image filenames belonging to 7 classes.
Found 1222 validated image filenames belonging to 7 classes.
Now, if I look at the contents of val_generator, I find only two distinct label values. This is the code I use to access the labels of the first batch in val_generator:
b = val_generator[0][1]
print(b)
which gives:
[0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.
0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 1. 0. 0.
0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. 1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 1. 1. 1. 0. 1. 0. 0. 0.
0. 0. 1. 0.]
As I understand it, there should be 7 distinct values (one per class), not just 0 and 1?
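Since a single batch of 100 may not be representative, counting labels would show which classes are actually present. A minimal sketch, where label_counts is a hypothetical helper written for this post (not part of Keras) and the sample input is just the first ten values of the batch printed above:

```python
from collections import Counter

def label_counts(labels):
    """Return {label: count} for any iterable of class indices, sorted by label."""
    return dict(sorted(Counter(int(x) for x in labels).items()))

# First ten label values of the batch printed above
first_ten = [0., 0., 0., 0., 0., 0., 0., 0., 1., 1.]
print(label_counts(first_ten))  # {0: 8, 1: 2}
```

Assuming the installed Keras version exposes the usual iterator attributes, label_counts(val_generator.classes) should cover every validation sample rather than one batch, and val_generator.class_indices should show how the growth_stage values map to these integers.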
Also, if I increase validation_split to 0.5, print(b) gives:
[0. 2. 1. 2. 0. 0. 0. 0. 1. 3. 1. 0. 0. 3. 3. 3. 3. 0. 1. 0. 2. 1. 0. 2.
2. 2. 1. 0. 1. 3. 0. 2. 1. 3. 0. 0. 1. 1. 2. 3. 0. 0. 0. 1. 0. 1. 3. 0.
1. 3. 2. 1. 1. 0. 2. 0. 0. 0. 2. 0. 1. 2. 0. 2. 3. 3. 1. 0. 0. 2. 0. 3.
1. 2. 0. 1. 1. 1. 1. 0. 3. 1. 1. 0. 1. 1. 2. 2. 0. 3. 1. 3. 2. 1. 2. 1.
0. 1. 2. 1.]
Can someone help me understand what is happening here and how to fix it?
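One possible explanation, offered as an assumption rather than something the output above proves: if the rows of the CSV are sorted by growth_stage and the validation subset is taken as a contiguous block from the start of the DataFrame, then a 20% split would only ever see the first classes, and a 50% split would see a few more. The pure-Python sketch below mimics that behavior on made-up data; the rows list and head_split function are hypothetical stand-ins, not Keras code.

```python
import random

# Hypothetical stand-in for the CSV: (filename, label) pairs sorted by label,
# 7 classes with 10 images each. The real file's ordering is an assumption.
rows = [(f"img_{label}_{i}.jpg", label) for label in range(7) for i in range(10)]

def head_split(rows, validation_split):
    """Use the first `validation_split` fraction of rows for validation, the rest for training."""
    cut = int(validation_split * len(rows))
    return rows[cut:], rows[:cut]

def distinct_labels(subset):
    """Sorted list of the distinct labels present in a subset of rows."""
    return sorted({label for _, label in subset})

# Splitting label-sorted rows: validation only contains the first classes.
_, val_20 = head_split(rows, 0.2)
_, val_50 = head_split(rows, 0.5)
print(distinct_labels(val_20))  # [0, 1]
print(distinct_labels(val_50))  # [0, 1, 2, 3]

# Shuffling the rows once before splitting mixes the classes across both subsets.
shuffled = rows[:]
random.Random(1).shuffle(shuffled)
_, val_shuffled = head_split(shuffled, 0.2)
print(distinct_labels(val_shuffled))
```

If that assumption holds, shuffling the DataFrame before creating the generators, for example with train_data = train_data.sample(frac=1, random_state=1).reset_index(drop=True), might be enough to get all 7 classes into both subsets.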