I pulled some data from the Pocket API, and the returned JSON has a key named list that contains nested JSON. Example below:
{'complete': 1,
'error': None,
'list': {'1992211110': {'authors': {'8683682': {'author_id': '8683682',
'item_id': '1992211110',
'name': 'Robert Kuttner',
'url': 'http://www.nybooks.com/contributors/robert-kuttner/'}},
'excerpt': 'What a splendid era this was going to be, with one remaining superpower spreading capitalism and liberal democracy around the world. Instead, democracy and capitalism seem increasingly incompatible.',
'favorite': '0',
'given_title': '',
'given_url': 'http://nyrevinc.cmail20.com/t/y-l-klpdut-jduhlyklkl-d/',
'has_image': '0',
'has_video': '0',
'is_article': '1',
'is_index': '0',
'item_id': '1992211110',
'resolved_id': '1977788178',
'resolved_title': 'The Man from Red Vienna',
'resolved_url': 'http://www.nybooks.com/articles/2017/12/21/karl-polanyi-man-from-red-vienna/',
'sort_id': 6,
'status': '0',
'time_added': '1520132694',
'time_favorited': '0',
'time_read': '0',
'time_updated': '1520140351',
'word_count': '4009'},
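I have managed to get the whole result into a DataFrame, but one of the columns, authors, appears to hold a nested dictionary. I have managed to split it out into a dictionary keyed by index, but I can't figure out how to convert that into a DataFrame. Sample of the authors below: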
{1: {'authors': {'8683682': {'author_id': '8683682',
'item_id': '1992211110',
'name': 'Robert Kuttner',
'url': 'http://www.nybooks.com/contributors/robert-kuttner/'}}},
2: {'authors': {'53525958': {'author_id': '53525958',
'item_id': '2086463428',
'name': 'Adam Tooze',
'url': 'http://www.nybooks.com/contributors/adam-tooze/'}}},
3: {'authors': {'3490600': {'author_id': '3490600',
'item_id': '2090266893',
'name': 'Adam Liaw',
'url': ''}}},
4: {'authors': {'75929933': {'author_id': '75929933',
'item_id': '2091894678',
'name': 'umair haque',
'url': 'https://eand.co/@umairh'}}},
5: {'authors': {'61177521': {'author_id': '61177521',
'item_id': '2092663780',
'name': 'Annalisa Merelli',
'url': 'https://qz.com/author/amerelliqz/'}}},
6: {'authors': {'52268529': {'author_id': '52268529',
'item_id': '2092922221',
'name': 'Aditya Chakrabortty',
'url': 'https://www.theguardian.com/profile/adityachakrabortty'}}},
7: {'authors': {'28083': {'author_id': '28083',
'item_id': '2096294305',
'name': 'Alana Semuels',
'url': ''}}},
8: {'authors': {'185472': {'author_id': '185472',
'item_id': '2097100251',
'name': 'TIM KREIDER',
'url': ''}}},
9: {'authors': {'2771923': {'author_id': '2771923',
'item_id': '2098788948',
'name': 'Richard Bernstein',
'url': 'http://www.nybooks.com/contributors/richard-bernstein/'}}},
10: {'authors': {'61111044': {'author_id': '61111044',
'item_id': '2102383890',
'name': 'Ephrat Livni',
'url': 'https://qz.com/author/livniqz/'}}}}
Any help is much appreciated; I am very new to both Python and pandas.
Answer 0 (score: 0)
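Here is one suggestion. You need to filter the auxiliary dictionary so that it can be ingested into a DataFrame. Here input is your second dictionary.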
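import pandas as pd
# Collect every author dict from every entry of input (the name shadows the builtin input(); kept to match the text above)
authors_filtered = [author for entry in input.values() for author in entry['authors'].values()]
output = pd.DataFrame(authors_filtered)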
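The same idea can be spelled out as an explicit loop, which may be easier to follow if you are new to pandas. This is only a sketch: authors_by_index is a hypothetical name holding two entries copied from the sample in the question, flatten_authors is a helper invented for illustration, and the use of .get() assumes some entries might have no authors key at all (the sample does not show such a case).

```python
import pandas as pd

# Hypothetical sample shaped like the second dictionary in the question
# (two of the ten entries shown there).
authors_by_index = {
    1: {'authors': {'8683682': {'author_id': '8683682',
                                'item_id': '1992211110',
                                'name': 'Robert Kuttner',
                                'url': 'http://www.nybooks.com/contributors/robert-kuttner/'}}},
    3: {'authors': {'3490600': {'author_id': '3490600',
                                'item_id': '2090266893',
                                'name': 'Adam Liaw',
                                'url': ''}}},
}


def flatten_authors(nested):
    """Return one dict per author, across all entries of the nested dictionary."""
    rows = []
    for entry in nested.values():
        # .get() tolerates entries without an 'authors' key (an assumption,
        # not something the sample data demonstrates).
        for author in entry.get('authors', {}).values():
            rows.append(author)
    return rows


# A list of flat dicts becomes one row per dict, one column per key.
authors_df = pd.DataFrame(flatten_authors(authors_by_index))
```

Because every author record carries its own item_id, the resulting frame could probably be joined back to the main articles frame on that column with DataFrame.merge, assuming the articles frame also has an item_id column.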