I have a classification problem where the class of interest is only 7% of the dataset, and the entire population is roughly 1200 observations.
I understand that ImageDataGenerator from Keras helps with augmenting the data to increase the number of observations before training the model. However, is it possible to augment only one class, that is, to add noise, blur, or apply transformations only to the minority class?
Answer 0 (score: 1)
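You can try balancing the classes with the class_weight parameter of the fit() function, which takes a dictionary mapping each class to a weight value. You can even use sklearn to compute appropriate class weights. See PSc's answer here: https://datascience.stackexchange.com/questions/13490/how-to-set-class-weights-for-imbalanced-classes-in-keras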
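As a rough illustration of what such a dictionary could look like, here is a minimal sketch that computes "balanced" weights in plain Python, using the heuristic n_samples / (n_classes * class_count), which is the same formula sklearn's compute_class_weight applies for class_weight="balanced". The helper name balanced_class_weights and the label list are made up for this example; the labels only mimic the 7% / 1200 figures from the question.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 1200 observations, 7% (84) in the class of interest.
labels = [1] * 84 + [0] * 1116
class_weight = balanced_class_weights(labels)
print(class_weight)

# The dictionary can then be handed to Keras, for example:
# model.fit(x_train, y_train, class_weight=class_weight)
```

With these counts the minority class gets a weight of about 7.14 and the majority class about 0.54, so each misclassified minority sample contributes roughly 13 times more to the loss.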
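Alternatively, you can use the Keras ImageDataGenerator together with flow_from_directory() and use the save_to_dir parameter to save the output of an image augmentation run to a directory, thereby generating more examples of the underrepresented class: https://keras.io/preprocessing/image/#imagedatagenerator
For that dummy run, you supply only samples of the class you want more samples of.
Then you use the balanced training and validation data for the actual training.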
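The same idea, augmenting only the underrepresented class and then training on the enlarged set, can also be sketched in memory. The snippet below does not use ImageDataGenerator or save_to_dir; it is a simplified NumPy stand-in that applies random horizontal flips and Gaussian noise to the minority class only. The function augment_minority, its parameters, and the random test data are all hypothetical, and it assumes images are stored as a float array of shape (n, height, width, channels) with values in [0, 1].

```python
import numpy as np


def augment_minority(images, labels, minority_label, copies=2, noise_std=0.05, seed=0):
    """Append `copies` (>= 1) augmented versions of every minority-class image."""
    rng = np.random.default_rng(seed)
    minority = images[labels == minority_label]
    new_images = []
    for _ in range(copies):
        batch = minority.copy()
        # Flip a random subset of the images horizontally (axis 2 is width).
        flip = rng.random(len(batch)) < 0.5
        batch[flip] = batch[flip][:, :, ::-1, :]
        # Add Gaussian noise and clip back to the valid pixel range.
        batch = np.clip(batch + rng.normal(0.0, noise_std, batch.shape), 0.0, 1.0)
        new_images.append(batch)
    extra = np.concatenate(new_images)
    extra_labels = np.full(len(extra), minority_label, dtype=labels.dtype)
    return np.concatenate([images, extra]), np.concatenate([labels, extra_labels])


# Hypothetical data: 100 small random "images", 7 of them in the minority class.
data_rng = np.random.default_rng(1)
images = data_rng.random((100, 8, 8, 3))
labels = np.array([1] * 7 + [0] * 93)

aug_images, aug_labels = augment_minority(images, labels, minority_label=1, copies=3)
print(aug_images.shape, int((aug_labels == 1).sum()))
```

The majority-class images pass through unchanged; only the minority class grows. A common precaution is to apply this kind of augmentation to the training split only, so that validation metrics still reflect the original class distribution.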
Answer 1 (score: 0)
There is a machine learning toolkit that lets you augment images with transformations, zoom/stretch, noise, and blurring.
The Image Augmentor can be found here: https://github.com/codebox/image_augmentor