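How can I compute normalization statistics over the entire training set? I want to normalize the test data using statistics computed from the training set. Here is the function:
[Note] The data are chest X-ray images.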
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def get_test_and_valid_generator(valid_df, test_df, train_df, image_dir, x_col, y_cols, sample_size=100, batch_size=64, seed=1, target_w=224, target_h=224):
    print("getting test and valid generators...")
    # Draw one shuffled batch of `sample_size` training images to estimate the statistics.
    raw_train_generator = ImageDataGenerator().flow_from_dataframe(
        dataframe=train_df,
        directory=image_dir,
        x_col=x_col,
        y_col=y_cols,
        class_mode="raw",
        batch_size=sample_size,
        shuffle=True,
        target_size=(target_h, target_w))  # Keras expects (height, width)
    batch = next(raw_train_generator)
    data_sample = batch[0]  # images only; batch[1] holds the labels

    # Fit the mean and std on the training sample only.
    image_generator = ImageDataGenerator(
        featurewise_center=True,
        featurewise_std_normalization=True)
    image_generator.fit(data_sample)

    # Validation data, normalized with the training statistics.
    valid_generator = image_generator.flow_from_dataframe(
        dataframe=valid_df,
        directory=image_dir,
        x_col=x_col,
        y_col=y_cols,
        class_mode="raw",
        batch_size=batch_size,
        shuffle=False,
        seed=seed,
        target_size=(target_h, target_w))

    # Test data, normalized with the same training statistics.
    test_generator = image_generator.flow_from_dataframe(
        dataframe=test_df,
        directory=image_dir,
        x_col=x_col,
        y_col=y_cols,
        class_mode="raw",
        batch_size=batch_size,
        shuffle=False,
        seed=seed,
        target_size=(target_h, target_w))
    return valid_generator, test_generator
My training set has 78,115 images, the test set has 25,595, and the validation set has 8,408.
Can anyone help? Is there another way to do this?
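
To make the goal concrete, here is a minimal sketch of what "statistics over the entire training set" could look like without loading all 78,115 images into memory at once. It accumulates per-channel sums batch by batch. The function name `streaming_channel_stats` is hypothetical (it is not part of Keras), and the sketch assumes each batch is a channels-last array of shape `(n, height, width, channels)`.

```python
import numpy as np

def streaming_channel_stats(batches):
    """Per-channel mean and std over an iterable of image batches.

    Hypothetical helper, not a Keras API. Each batch is assumed to have
    shape (n, height, width, channels).
    """
    count = 0          # number of pixels seen per channel
    total = None       # running sum per channel
    total_sq = None    # running sum of squares per channel
    for batch in batches:
        x = np.asarray(batch, dtype=np.float64)
        if total is None:
            total = np.zeros(x.shape[-1])
            total_sq = np.zeros(x.shape[-1])
        total += x.sum(axis=(0, 1, 2))
        total_sq += (x ** 2).sum(axis=(0, 1, 2))
        count += x.shape[0] * x.shape[1] * x.shape[2]
    if count == 0:
        raise ValueError("no batches were provided")
    mean = total / count
    # Var[x] = E[x^2] - E[x]^2; clip tiny negatives caused by rounding.
    var = np.maximum(total_sq / count - mean ** 2, 0.0)
    return mean, np.sqrt(var)
```

With the code above, one pass would mean iterating exactly `len(raw_train_generator)` batches (the Keras iterator loops forever) and passing `batch[0]` from each. As far as I can tell, `ImageDataGenerator.fit` stores its results in the `mean` and `std` attributes, so assigning the reshaped results (for example `mean.reshape(1, 1, -1)`) to `image_generator.mean` and `image_generator.std` instead of calling `fit` might work, but that is an assumption worth checking against the installed Keras version.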