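I have a JSON file containing image URLs and labels. I am trying to load the images with tf.keras.utils.get_file(), but that way I can only download one image at a time. I collected all the URLs in a list and then tried to load the images from those URLs into a new list with tf.keras.utils.get_file(). Why doesn't this work?
JSON file structure: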
[{"ID":"-","DataRow ID":"-","Labeled Data":"url is here!","Label":{"dorsaalinen kallistuskulma":[{"geometry":{"x":217,"y":269}},{"geometry":{"x":243,"y":263}}]},"Created By":"-","Project Name":"syvärit (testi)","Created At":"","Seconds to Label":42.286,"External ID":"image5 (2).png","Agreement":null,"Dataset Name":"ranne yhdistelmä","Reviews":[],"View Label":"-"},{"ID":"-","DataRow ID":"-","Labeled Data":"url is here","Label":{"dorsaalinen kallistuskulma":[{"geometry":{"x":217,"y":266}},{"geometry":{"x":243,"y":263}}]},"Created By":"-","Project Name":"syvärit (testi)","Created At":"","Seconds to Label":16.801,"External ID":"image5.png","Agreement":null,"Dataset Name":"ranne yhdistelmä","Reviews":[],"View Label":""}]
Code:
import json
import tensorflow as tf

with open(filename) as f:
    data = json.load(f)

# loading json data (url's) to list
url = []
for object in data:
    url.append(object['Labeled Data'])

# loading the images
pictures = []
for i in url:
    pictures = tf.keras.utils.get_file('fname', i, untar=True)
# loads only one file and if I use pictures.append(tf.keras.utils.get_file) it doesn't download anything.
Answer 0 (score: 0)
You can try gapcv, a framework for preprocessing ML data. It works as follows.
Install gapcv:
pip install gapcv
Import Images from gapcv.vision:
from gapcv.vision import Images
The JSON file needs a few fixes first, because gapcv reads JSON in a specific layout. See the documentation:
[
{'label': 'cat', 'image': 'http://example.com/c1.jpg'},
{'label': 'dog', 'image': 'http://example.com/d1.jpg'},
...
]
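Run this to create a new_label key that holds the label name extracted from the nested dictionary:
# json_file is assumed to be the list of records loaded from the original file, e.g. with json.load()
for image in json_file:
    for key in list(image):
        if key == 'Label':
            image['new_label'] = list(image['Label'].keys())[0]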
You will get something like:
'new_label': 'dorsaalinen kallistuskulma'
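To see this step end to end, here is a minimal, self-contained sketch run on two trimmed-down records shaped like the ones in the question. The `records` list, the example.com URLs and the `add_new_label` helper are made up for illustration; only the key names `Labeled Data` and `Label` come from the question's file.

```python
def add_new_label(records):
    """Copy the first key of each record's nested 'Label' dict into 'new_label'."""
    for record in records:
        label = record.get("Label")
        # Skip records without a usable label dict (e.g. unlabeled rows).
        if isinstance(label, dict) and label:
            record["new_label"] = next(iter(label))
    return records


# Hypothetical records, trimmed to the two keys that matter here.
records = [
    {"Labeled Data": "http://example.com/image5.png",
     "Label": {"dorsaalinen kallistuskulma": [{"geometry": {"x": 217, "y": 269}}]}},
    {"Labeled Data": "http://example.com/image6.png", "Label": {}},
]

add_new_label(records)
print(records[0]["new_label"])    # dorsaalinen kallistuskulma
print("new_label" in records[1])  # False
```

Unlike indexing with `[0]`, this variant leaves records with an empty or missing `Label` untouched instead of raising an error, which may or may not be what you want for your data.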
Save the new json_file:
import json

with open('data.json', 'w') as outfile:
    json.dump(json_file, outfile)
Now we can use gapcv to download and preprocess the images from the URLs:
images = Images('my_new_file', 'data.json', config=['image_key=Labeled Data', 'label_key=new_label', 'store', 'resize=(224,224)'])
This creates a my_new_file.h5 file that is ready to be fed to your model :)
You can also use a generator and pass it to Keras:
# Stream the data from the `my_new_file.h5` file so you don't overload your memory.
# Augment only if needed; otherwise use just Images(config=['stream']).
# Normalization is 1.0/255.0 by default.
images = Images(config=['stream'], augment=['flip=both', 'edge', 'zoom=0.3', 'denoise'])
images.load('my_new_file')
# Metadata
print('images train')
print('Time to load data set:', images.elapsed)
print('Number of images in data set:', images.count)
print('classes:', images.classes)
Generator:
images.split = 0.2
images.minibatch = 32
gap_generator = images.minibatch
X_test, Y_test = images.test
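For readers unfamiliar with what a hold-out split and a minibatch generator provide, the idea can be sketched in plain Python. This is not gapcv's implementation; `split_holdout` and `minibatches` are hypothetical helpers, and the integer "samples" stand in for image arrays.

```python
import random


def split_holdout(samples, test_fraction, seed=0):
    """Shuffle a copy of samples and hold out test_fraction of them."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def minibatches(train, batch_size):
    """Yield batches forever; a Keras training generator must never run dry.

    A real generator would yield (inputs, targets) pairs rather than bare samples.
    """
    while True:
        for start in range(0, len(train), batch_size):
            yield train[start:start + batch_size]


train, test = split_holdout(range(100), test_fraction=0.2)
batches = minibatches(train, batch_size=32)
first = next(batches)
print(len(train), len(test), len(first))  # 80 20 32
```

Because the generator loops endlessly, the training call needs to be told how many batches make up one epoch.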
Fit the Keras model:
model.fit_generator(generator=gap_generator,
                    validation_data=(X_test, Y_test),
                    epochs=epochs,
                    steps_per_epoch=steps_per_epoch)
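The call above uses `epochs` and `steps_per_epoch` without defining them. One common way to pick `steps_per_epoch` is the number of batches needed to see every training sample once. The figures below are made up for illustration; substitute your own dataset size.

```python
import math

# Hypothetical numbers for illustration only.
n_images = 1000        # total images in the dataset
test_fraction = 0.2    # same value as images.split
batch_size = 32        # same value as images.minibatch

n_train = n_images - int(n_images * test_fraction)
# Round up so the last, smaller batch is still counted.
steps_per_epoch = math.ceil(n_train / batch_size)
epochs = 10            # arbitrary choice

print(n_train, steps_per_epoch)  # 800 25
```

Note that in recent TensorFlow 2.x releases `fit_generator` is deprecated and `model.fit` accepts a generator directly, with the same `steps_per_epoch` argument.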
Why use gapcv? Well, model fitting is two times faster than with ImageDataGenerator() :)
Example in Colab
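As for why the loop in the question downloads only one file, three things appear to be going on. `pictures = ...` rebinds the name on every iteration instead of appending to the list. Every call also passes the same cache name, 'fname', so after the first download get_file finds an existing cached file and does not fetch the next URL. Finally, `untar=True` is meant for tar archives, not for single images. The variant mentioned in the comment, `pictures.append(tf.keras.utils.get_file)`, appends the function object itself without calling it. A minimal sketch of a corrected loop follows; `cache_name` and `download_all` are hypothetical helpers, the URLs are placeholders, and the naming scheme is just one way to keep file names distinct.

```python
import os
from urllib.parse import urlparse


def cache_name(url, index):
    """Build a distinct local file name for each URL (hypothetical scheme)."""
    base = os.path.basename(urlparse(url).path) or "image"
    return f"{index:05d}_{base}"


def download_all(urls):
    """Download every URL with tf.keras.utils.get_file and return local paths."""
    import tensorflow as tf  # imported lazily; needs TensorFlow and network access

    paths = []
    for index, url in enumerate(urls):
        # A distinct fname per URL stops get_file from reusing one cached file.
        # untar is left at its default because the images are not tar archives.
        paths.append(tf.keras.utils.get_file(cache_name(url, index), origin=url))
    return paths


# Placeholder URLs; in the question these come from each record's 'Labeled Data'.
urls = ["http://example.com/a/image5.png", "http://example.com/b/image5.png"]
names = [cache_name(u, i) for i, u in enumerate(urls)]
print(names)  # ['00000_image5.png', '00001_image5.png']
```

Calling `download_all(url)` with the list built in the question should then return one local path per image, which can be opened with whatever image loader you prefer.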