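Tumblr Python: storing post photo URLs from JSON

Date: 2017-01-01 17:33:57

Tags: python arrays json tumblr

I'm trying to work out how to store multiple URL links under a key in a Python array, or by any other method, as long as I can store more than one URL.

In the data I'm using, each post may or may not contain several "photos" image objects (in JSON), so I want to store the image objects for every post.

For example, data from https://www.tumblr.com/docs/en/api/v2: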

"posts": [
         {
            "blog_name": "derekg",
            "id": 7431599279,
            "post_url": "http:\/\/derekg.org\/post\/7431599279",
            "type": "photo",
            "date": "2011-07-09 22:09:47 GMT",
            "timestamp": 1310249387,
            "format": "html",
            "reblog_key": "749amggU",
            "tags": [],
            "note_count": 18,
            "caption": "<p>my arm is getting tired.<\/p>",
            "photos": [
               {
                  "caption": "",
                  "alt_sizes": [
                     {
                        "width": 1280,
                        "height": 722,
                        "url": "http:\/\/derekg.org\/photo\/1280\/7431599279\/1\/tumblr_lo36wbWqqq1qanqww"
                     },
                     {
                        "width": 500,
                        "height": 282,
                        "url": "http:\/\/30.media.tumblr.com\/tumblr_lo36wbWqqq1qanqwwo1_500.jpg"
                     },
                     {
                        "width": 400,
                        "height": 225,
                        "url": "http:\/\/29.media.tumblr.com\/tumblr_lo36wbWqqq1qanqwwo1_400.jpg"
                     },
                     {
                        "width": 250,
                        "height": 141,
                        "url": "http:\/\/26.media.tumblr.com\/tumblr_lo36wbWqqq1qanqwwo1_250.jpg"
                     },
                     {
                        "width": 100,
                        "height": 56,
                        "url": "http:\/\/24.media.tumblr.com\/tumblr_lo36wbWqqq1qanqwwo1_100.jpg"
                     },
                     {
                        "width": 75,
                        "height": 75,
                        "url": "http:\/\/30.media.tumblr.com\/tumblr_lo36wbWqqq1qanqwwo1_75sq.jpg"
                     }
                  ]
               }
            ]
         }
      ]

My Python so far:

raw_json_data = requests.get('api.tumblr.com/v2/blog/{blog-identifier}/likes?api_key={key}')
data = raw_json_data.json()
data_format = data['response']['liked_posts']

number = 0

dat = [{} for i in range(len(data['response']['liked_posts']))]

for posts in data_format:
    #print(posts['blog_name'])
    #print(posts['timestamp'])
    g = 0
    dat[number]['blog_name'] = posts['blog_name']
    dat[number]['tags'] = posts['tags']
    dat[number]['timestamp'] = posts['timestamp']

    if len(posts['photos']) > 1:
    dat[number]['url'] = {}
    g = 0
    for g, u in range(len(posts['photos'])):
        dat[number]['url'][g] = u['alt_sizes'][0]['url']
        g += 1

    number += 1
with open(json_storage, 'w') as outputFile:
    json.dump(dat, outputFile)

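I'm now getting errors: nothing is being stored in my JSON file, and the "url" key is now missing from every post.

1 Answer:

Answer 0: (score: 0)

I don't see a minimal working example (I would need to see the JSON parsing), but it looks to me as if you are trying to create several levels of dictionary keys in a single statement. Item assignment goes through __setitem__ on the innermost container only, so Python lets you initialize just one key/index at a time. You therefore need to initialize ...['url'] before you can assign to ...['url'][g].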

dat[number]['url'][g] = ... can also be read as ((dat[number])['url'])[g] = .... Written that way, you can see the two index reads that have to succeed before the assignment happens.
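To make the chained lookup concrete, here is a minimal sketch. The `record` dict and the example.com URLs are made up for illustration and stand in for `dat[number]`; they are not from the question.

```python
# Hypothetical record standing in for dat[number]; the URLs are placeholders.
record = {'blog_name': 'derekg'}

# record['url'][0] = ... first READS record['url'], which does not exist yet,
# so the statement fails before any assignment takes place.
try:
    record['url'][0] = 'http://example.com/a.jpg'
    failed = False
except KeyError:
    failed = True

# Creating the inner dict first makes the same assignment work.
record['url'] = {}
record['url'][0] = 'http://example.com/a.jpg'

# setdefault creates the inner dict only if it is missing, then returns it.
record.setdefault('url', {})[1] = 'http://example.com/b.jpg'
```

`dict.setdefault` is one way to fold the "initialize, then assign" pair into a single line; an explicit `record['url'] = {}` works just as well.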

Partial example (pseudocode!) using a dictionary comprehension:

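# JSON_DATA stands for the list of posts, e.g. data['response']['liked_posts'].
dat = [{} for _ in JSON_DATA]
number = 0

for posts in JSON_DATA:
    dat[number]['blog_name'] = posts['blog_name']
    dat[number]['tags'] = posts['tags']
    dat[number]['timestamp'] = posts['timestamp']

    # Use .get() so posts without a 'photos' key are skipped instead of raising KeyError.
    photos = posts.get('photos', [])
    if photos:  # was "> 1", which skipped posts with exactly one photo
        # Build the inner dict in one step: photo index -> URL of the first listed size.
        dat[number]['url'] = {k: v['alt_sizes'][0]['url'] for (k, v) in enumerate(photos)}
    number += 1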
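One thing worth knowing about the dictionary comprehension above: `enumerate` produces integer keys, and JSON object keys are always strings, so the keys change type once the data has been written out and read back. The sketch below shows this with placeholder photo objects (the example.com URLs are invented, not API output).

```python
import json

# Placeholder photo objects shaped like the 'photos' entries in the question.
photos = [
    {'alt_sizes': [{'url': 'http://example.com/1_1280.jpg'}]},
    {'alt_sizes': [{'url': 'http://example.com/2_1280.jpg'}]},
]

# Same comprehension as above: photo index -> URL of the first listed size.
urls = {k: v['alt_sizes'][0]['url'] for k, v in enumerate(photos)}

# Serialise to JSON text and parse it back again.
restored = json.loads(json.dumps(urls))

print(list(urls))      # integer keys before serialising
print(list(restored))  # string keys after the round trip
```

If the indices only record order, a plain list of URLs may be a simpler fit, since a list survives the round trip unchanged.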

Alternative example (more pseudocode), mostly keeping the same for loop:

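# JSON_DATA stands for the list of posts, e.g. data['response']['liked_posts'].
dat = [{} for _ in JSON_DATA]
number = 0

for posts in JSON_DATA:
    dat[number]['blog_name'] = posts['blog_name']
    dat[number]['tags'] = posts['tags']
    dat[number]['timestamp'] = posts['timestamp']

    # Use .get() so posts without a 'photos' key are skipped instead of raising KeyError.
    photos = posts.get('photos', [])
    if photos:  # was "> 1", which skipped posts with exactly one photo
        # Initialize the inner dict before assigning into it.
        dat[number]['url'] = {}
        # enumerate yields (index, photo) pairs; range(len(...)) yields bare ints.
        for g, u in enumerate(photos):
            dat[number]['url'][g] = u['alt_sizes'][0]['url']
    number += 1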
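Putting the pieces together, the following is one possible end-to-end sketch rather than a definitive implementation. The helper name `collect_photo_urls` and the sample posts are hypothetical, and it assumes that non-photo posts simply have no 'photos' key. It stores the URLs in a list, which the question allows ("or any other method").

```python
import json


def collect_photo_urls(posts):
    """Return one record per post, with every photo URL in a list."""
    records = []
    for post in posts:
        records.append({
            'blog_name': post['blog_name'],
            'tags': post['tags'],
            'timestamp': post['timestamp'],
            # Assumption: posts that are not photo posts lack 'photos' entirely,
            # so fall back to an empty list instead of raising KeyError.
            'url': [photo['alt_sizes'][0]['url']
                    for photo in post.get('photos', [])],
        })
    return records


# Hand-written sample posts; the values are placeholders, not real API output.
sample_posts = [
    {'blog_name': 'derekg', 'tags': [], 'timestamp': 1310249387,
     'photos': [{'alt_sizes': [{'url': 'http://example.com/a_1280.jpg'},
                               {'url': 'http://example.com/a_500.jpg'}]}]},
    {'blog_name': 'textblog', 'tags': ['note'], 'timestamp': 1310249400},
]

records = collect_photo_urls(sample_posts)
serialised = json.dumps(records, indent=2)
```

With real data, `posts` would be `data['response']['liked_posts']` from the question, and `json.dump(records, outputFile)` would write the same structure to the file. Every record gets a 'url' key, so downstream code does not have to check whether it exists.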