I can't seem to extract all the metadata I need from a nested JSON using json_normalize. See the JSON below. I am trying to retrieve the title ("Some Book") from the content node, but I can only get as deep as content.
For example:
json_normalize(result_data, 'data', ['title', 'key',['group_dimensions','content']])
yields:
,date,units,group_dimensions.content,key,title
0,2019-03-17T00:00:00.000Z,0.0,"{u'key': u'1358883623', u'title': u'Some Book'}",143489,Czech Republic
1,2019-03-24T00:00:00.000Z,10.0,"{u'key': u'1358883623', u'title': u'Some Book'}",143489,Czech Republic
2,2019-03-31T00:00:00.000Z,13.0,"{u'key': u'1358883623', u'title': u'Some Book'}",143489,Czech Republic
3,2019-03-17T00:00:00.000Z,0.0,"{u'key': u'1358883623', u'title': u'Some Book'}",143487,Romania
But I still need to extract the title. Going one level deeper:
json_normalize(result_data, 'data', ['title', 'key',['group_dimensions',['content','title']]])
yields an error: TypeError: sequence item 1: expected string, list found
Any ideas?
Answer 0 (score: 0)
You can use the following package, which I wrote. It expands every dict it finds in a DataFrame.
import pandas as pd
import flat_table
# I am selecting group dimensions, key, metadata, and title.
df = pd.DataFrame(result_data).iloc[:, 1:]
flat_table.normalize(df)
It finds all dicts and expands them into new columns.
   index  store_front_ica.title  store_front_ica.key  content.title  content.key   key_x         title_x         title_y   key_y
0      0         Czech Republic               143489      Some Book    123456789  143489  Czech Republic  Czech Republic  143489
1      1                Romania               143487      Some Book    123456789  143487         Romania         Romania  143487
The package also expands lists into rows. Here is the full df:
   index  units                      date  store_front_ica.title  store_front_ica.key  content.title  content.key   key_x         title_x         title_y   key_y
0      0    0.0  2019-03-17T00:00:00.000Z         Czech Republic               143489      Some Book    123456789  143489  Czech Republic  Czech Republic  143489
1      0   10.0  2019-03-24T00:00:00.000Z         Czech Republic               143489      Some Book    123456789  143489  Czech Republic  Czech Republic  143489
2      0   13.0  2019-03-31T00:00:00.000Z         Czech Republic               143489      Some Book    123456789  143489  Czech Republic  Czech Republic  143489
3      1    0.0  2019-03-17T00:00:00.000Z                Romania               143487      Some Book    123456789  143487         Romania         Romania  143487
4      1    0.0  2019-03-24T00:00:00.000Z                Romania               143487      Some Book    123456789  143487         Romania         Romania  143487
5      1  200.0  2019-03-31T00:00:00.000Z                Romania               143487      Some Book    123456789  143487         Romania         Romania  143487
You can try flat-table.
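If you would rather not add a dependency, json_normalize itself may be enough. The TypeError in the question most likely comes from nesting a list inside a meta path: each meta entry is expected to be either a string or a flat list of keys. The sketch below assumes a recent pandas (where the function is exposed as pd.json_normalize) and uses a hypothetical result_data whose shape is guessed from the output shown in the question, so adjust the keys to match the real payload.

```python
import pandas as pd

# Hypothetical input: the real JSON is not shown, so this shape is an
# assumption reconstructed from the columns in the question's output.
result_data = [
    {
        "title": "Czech Republic",
        "key": "143489",
        "group_dimensions": {
            "content": {"key": "1358883623", "title": "Some Book"}
        },
        "data": [
            {"date": "2019-03-17T00:00:00.000Z", "units": 0.0},
            {"date": "2019-03-24T00:00:00.000Z", "units": 10.0},
        ],
    },
    {
        "title": "Romania",
        "key": "143487",
        "group_dimensions": {
            "content": {"key": "1358883623", "title": "Some Book"}
        },
        "data": [
            {"date": "2019-03-17T00:00:00.000Z", "units": 0.0},
        ],
    },
]

# Each meta entry is a string or a FLAT list of keys describing the path.
# ['group_dimensions', 'content', 'title'] walks three levels down;
# ['group_dimensions', ['content', 'title']] nests a list and fails.
df = pd.json_normalize(
    result_data,
    record_path="data",
    meta=["title", "key", ["group_dimensions", "content", "title"]],
)

print(df)
```

The nested title ends up in a column named group_dimensions.content.title, one value per row of data, alongside the top-level title and key.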