I'm working on a CNN and need to grab some images from URIs in a json file while keeping them associated with the corresponding ids. I have a json file that looks something like this. I want to iterate through each product and extract 'id' and, from 'image_uris', the "large" uri.
[{
    "product_type": "widget",
    "id": "1744556-ghh56h-4633",
    "manufacture_id": "AAB4567",
    "store_ids": [416835, 456145],
    "name": "Best Widget",
    "origin": "US",
    "manufactured": "2018-08-26",
    "uri": "https://bobswidgets.com/best_widget",
    "image_uris": {
        "small": "https://bobswidgets.com/small/best_widget_sm.jpg",
        "normal": "https://bobswidgets.com/medium/best_widget_md.jpg",
        "large": "https://bobswidgets.com/large/best_widget_lg.jpg",
    },
    "manufacture_cost": "12.50",
},
{
    "product_type": "widget",
    "id": "0956786-dje596-3904",
    "manufacture_id": "BCD13D",
    "store_ids": [014329, 40123],
    "name": "Best Widget2",
    "origin": "US",
    "manufactured": "2018-10-03",
    "uri": "https://bobswidgets.com/best_widget_2",
    "image_uris": {
        "small": "https://bobswidgets.com/small/best_widget2_sm.jpg",
        "normal": "https://bobswidgets.com/medium/best_widget2_md.jpg",
        "large": "https://bobswidgets.com/large/best_widget2_lg.jpg",
    },
    "manufacture_cost": "13.33",
}]
I then want to put them into their own dictionary like this. At least this is what I think I want to do unless there is a better idea:
[{"1744556-ghh56h-4633" : "https://bobswidgets.com/large/best_widget_lg.jpg"}, {"0956786-dje596-3904" : "https://bobswidgets.com/large/best_widget2_lg.jpg"}]
My endgame is to grab the images at those URIs and save them with the 'id' as the image name, like this:
1744556-ghh56h-4633_lg.jpg
0956786-dje596-3904_lg.jpg
Eventually these images will be used for the CNN, as I mentioned earlier. When an image is recognized, a lookup can be performed to return all the other values from the json file.
So far here is the code I've been using to extract the data I want. It grabs the 'id' fine but it grabs all of the image uris. I only want the 'large' uri.
import ujson as json

with open('product.json', 'r') as f:
    prod_txt = f.read()

prod_dict = json.loads(prod_txt)

id = []
uris = []

for dictionary in prod_dict:
    id.append(list(dictionary.values())[1])
    if isinstance(dictionary, dict):
        uris.append(list(dictionary.values())[8])
I've made various attempts to single out the 'large' uri without success. I'm not really sure how to do it with a nested dictionary without throwing an error. I'm sure it is something simple, but I'm still an amateur coder.
Answer 0 (score: 1)
This can be done quite simply with a list comprehension:
In [106]: img_ids = [{d['id']: d['image_uris']['large']} for d in prod_dict]
In [107]: img_ids
Out[107]:
[{'1744556-ghh56h-4633': 'https://bobswidgets.com/large/best_widget_lg.jpg'},
{'0956786-dje596-3904': 'https://bobswidgets.com/large/best_widget2_lg.jpg'}]
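Note that this assumes every dict in the list always has an 'id' and a 'large' value inside 'image_uris'. If either is missing, you will get a KeyError. In that case, you will have to make use of dict.get, as shown here for an entry with no 'image_uris' (a missing 'id' would still raise a KeyError, because 'id' is still accessed with d['id']):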
# Adding new entry without 'image_uris' dict
In [110]: prod_dict.append({'id': 'new_id'})
In [111]: img_ids = [{d['id']: d.get('image_uris', {}).get('large', 'N/A')} for d in prod_dict]
In [112]: img_ids
Out[112]:
[{'1744556-ghh56h-4633': 'https://bobswidgets.com/large/best_widget_lg.jpg'},
{'0956786-dje596-3904': 'https://bobswidgets.com/large/best_widget2_lg.jpg'},
{'new_id': 'N/A'}]
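If you would rather leave incomplete records out than store a placeholder such as 'N/A', a small helper along these lines could do it. This is only a sketch: the function name extract_large_uris and the sample data are made up for illustration, and it returns a single dict of id to uri rather than a list of one-item dicts.

```python
def extract_large_uris(products):
    """Map each product id to its 'large' image URI, skipping incomplete records."""
    result = {}
    for product in products:
        product_id = product.get('id')
        # 'or {}' also covers a record whose image_uris is null/None
        large_uri = (product.get('image_uris') or {}).get('large')
        if product_id is None or large_uri is None:
            continue  # no id or no large image, so leave the record out
        result[product_id] = large_uri
    return result


# Made-up sample in the same shape as product.json (hypothetical URLs)
sample = [
    {'id': 'a-1', 'image_uris': {'large': 'https://example.com/large/a_lg.jpg'}},
    {'id': 'new_id'},                                                  # no image_uris
    {'image_uris': {'large': 'https://example.com/large/x_lg.jpg'}},   # no id
]
print(extract_large_uris(sample))
```

Only the first sample record has both an id and a large image, so it is the only one kept.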
Answer 1 (score: 0)
Your edit to the product.json file still doesn't make it valid JSON, so I used the following instead:
[
  {
    "product_type": "widget",
    "id": "1744556-ghh56h-4633",
    "manufacture_id": "AAB4567",
    "store_ids": [
      416835,
      456145
    ],
    "name": "Best Widget",
    "origin": "US",
    "manufactured": "2018-08-26",
    "uri": "https://bobswidgets.com/best_widget",
    "image_uris": {
      "small": "https://bobswidgets.com/small/best_widget_sm.jpg",
      "normal": "https://bobswidgets.com/medium/best_widget_md.jpg",
      "large": "https://bobswidgets.com/large/best_widget_lg.jpg"
    },
    "manufacture_cost": "12.50"
  },
  {
    "product_type": "widget",
    "id": "0956786-dje596-3904",
    "manufacture_id": "BCD13D",
    "store_ids": [
      "014329",
      "40123"
    ],
    "name": "Best Widget2",
    "origin": "US",
    "manufactured": "2018-10-03",
    "uri": "https://bobswidgets.com/best_widget_2",
    "image_uris": {
      "small": "https://bobswidgets.com/small/best_widget2_sm.jpg",
      "normal": "https://bobswidgets.com/medium/best_widget2_md.jpg",
      "large": "https://bobswidgets.com/large/best_widget2_lg.jpg"
    },
    "manufacture_cost": "13.33"
  }
]
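So, setting that issue aside and assuming you can fix the file yourself, you can create the dictionary you want with something called a dictionary display (here in its comprehension form), which is very similar to a list comprehension.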
import json
from pprint import pprint

filename = 'product.json'
with open(filename, 'r') as f:
    prod_txt = f.read()
prod_list = json.loads(prod_txt)

result_dict = {product['id']: product['image_uris']['large']
               for product in prod_list}
pprint(result_dict)
Output:
{'0956786-dje596-3904': 'https://bobswidgets.com/large/best_widget2_lg.jpg',
'1744556-ghh56h-4633': 'https://bobswidgets.com/large/best_widget_lg.jpg'}
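From a dictionary like that, the remaining steps in the question (saving each image under its id, and looking up the full record once an image is recognized) might be sketched as follows. Treat this as a rough outline under some assumptions: the helper names target_filename, download_images and index_by_id, the '_lg' suffix argument and the 'images' output directory are all made up for illustration, and the download uses the standard library's urllib.request.urlretrieve with no error handling or retries.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def target_filename(product_id, uri, suffix='_lg'):
    """Build a name such as '1744556-ghh56h-4633_lg.jpg' from an id and a URI."""
    # Reuse the extension from the URI path; fall back to '.jpg' if there is none
    extension = os.path.splitext(urlparse(uri).path)[1] or '.jpg'
    return f'{product_id}{suffix}{extension}'


def download_images(id_to_uri, out_dir='images'):
    """Download every URI into out_dir, naming each file after its product id."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for product_id, uri in id_to_uri.items():
        path = os.path.join(out_dir, target_filename(product_id, uri))
        urlretrieve(uri, path)  # performs a network request
        saved.append(path)
    return saved


def index_by_id(products):
    """Key the full product records by id so a recognized id returns everything."""
    return {product['id']: product for product in products}


result_dict = {
    '1744556-ghh56h-4633': 'https://bobswidgets.com/large/best_widget_lg.jpg',
    '0956786-dje596-3904': 'https://bobswidgets.com/large/best_widget2_lg.jpg',
}
for product_id, uri in result_dict.items():
    print(target_filename(product_id, uri))

# download_images(result_dict)  # uncomment to actually fetch the files
```

Calling index_by_id(prod_list) once gives a dict in which products['1744556-ghh56h-4633'] returns the whole record, so the id recovered from a file name is enough to look up name, origin, cost and so on.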