I'd like to know whether there is a concise, Pythonic way to do this. Given a DataFrame like:

   phone
0  {"brand":{"type":"android"},"names":[{"id":"1", "name":"a-1"},{"id":"2", "name":"a-2"}]}
1  {"brand":{"type":"iphone"},"names":[{"id":"3", "name":"i-1"},{"id":"4", "name":"i-2"}]}

I want to expand the JSON fields into DataFrame columns to get this:
      type id name
0  android  1  a-1
1  android  2  a-2
2   iphone  3  i-1
3   iphone  4  i-2

I have found a good solution:

import pandas as pd


def parser_expand_json(data):
    # Turn one dict into a Series whose index holds the dict keys.
    keys = []
    values = []
    for key in data:
        keys.append(key)
        values.append(data.get(key))
    return pd.Series(values, index=keys)


# that is it
def test():
    data = [{'brand': {'type': 'android'}, 'names': [{'id': '1', 'name': 'a-1'}, {'id': '2', 'name': 'a-2'}]},
            {'brand': {'type': 'iphone'}, 'names': [{'id': '3', 'name': 'i-1'}, {'id': '4', 'name': 'i-2'}]}]
    df = pd.DataFrame(data)
    # expand json list to N rows
    df = df.merge(df['names'].apply(pd.Series), right_index=True, left_index=True).drop('names', axis=1).melt(
        id_vars=['brand'], value_name='names').drop('variable', axis=1)
    """
                     brand                       names
    0  {'type': 'android'}  {'id': '1', 'name': 'a-1'}
    1   {'type': 'iphone'}  {'id': '3', 'name': 'i-1'}
    2  {'type': 'android'}  {'id': '2', 'name': 'a-2'}
    3   {'type': 'iphone'}  {'id': '4', 'name': 'i-2'}
    """
    print(df)
    # expand json key to columns name
    df = pd.concat([df, df['brand'].apply(parser_expand_json), df['names'].apply(parser_expand_json)], axis=1).drop(
        ['brand', 'names'], axis=1)
    """
          type id name
    0  android  1  a-1
    1   iphone  3  i-1
    2  android  2  a-2
    3   iphone  4  i-2
    """
    print(df)


if __name__ == '__main__':
    test()
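For comparison, here is a minimal sketch of the same two steps (one row per list element, then dict keys to columns) using DataFrame.explode. It assumes pandas 0.25 or newer, where explode is available; the function name expand_phones is made up for illustration and is not part of the original post.

```python
import pandas as pd

data = [
    {"brand": {"type": "android"}, "names": [{"id": "1", "name": "a-1"}, {"id": "2", "name": "a-2"}]},
    {"brand": {"type": "iphone"}, "names": [{"id": "3", "name": "i-1"}, {"id": "4", "name": "i-2"}]},
]


def expand_phones(records):
    """Hypothetical helper: flatten 'brand' and 'names' into plain columns."""
    df = pd.DataFrame(records)
    # Step 1: one row per element of each "names" list; "brand" is repeated.
    exploded = df.explode("names").reset_index(drop=True)
    # Step 2: turn each column of dicts into ordinary columns.
    brand_cols = pd.DataFrame(exploded["brand"].tolist())
    name_cols = pd.DataFrame(exploded["names"].tolist())
    return pd.concat([brand_cols, name_cols], axis=1)


result = expand_phones(data)
print(result)
```

Unlike the merge/melt version, explode keeps the rows of each phone together, so the result follows the row order requested in the question.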
Answer 0 (score: 0)
A solution that uses lists to manually build a new DataFrame with the desired structure:
import pandas as pd

json = [
    {"brand":{"type":"android"},"names":[{"id":"1", "name":"a-1"},{"id":"2", "name":"a-2"}]},
    {"brand":{"type":"iphone"},"names":[{"id":"3", "name":"i-1"},{"id":"4", "name":"i-2"}]}
]
json_data = {'phone': json}
df_1 = pd.DataFrame(json_data)

# One list per output column.
type_list = []
id_list = []
name_list = []
for row in df_1.phone:
    # Emit one output row per entry in "names", repeating the brand type.
    for item in row['names']:
        type_list.append(row['brand']['type'])
        id_list.append(item['id'])
        name_list.append(item['name'])

data = {'type':type_list, 'id':id_list, 'name':name_list}
df_2 = pd.DataFrame(data)
One way to use json_normalize() is to first restructure the JSON into the desired column layout. A solution along those lines for this case:
import pandas as pd

json = [
    {"brand":{"type":"android"},"names":[{"id":"1", "name":"a-1"},{"id":"2", "name":"a-2"}]},
    {"brand":{"type":"iphone"},"names":[{"id":"3", "name":"i-1"},{"id":"4", "name":"i-2"}]}
]

# Flatten to one dict per output row before normalizing.
json_mod = []
for row in json:
    for item in row['names']:
        json_mod.append({'type':row['brand']['type'],'id':item['id'],'name':item['name']})

# pandas >= 1.0; older versions used: from pandas.io.json import json_normalize
df_3 = pd.json_normalize(json_mod)
Both df_2 and df_3 display as (column order may vary with the pandas and Python version):
  id name     type
0  1  a-1  android
1  2  a-2  android
2  3  i-1   iphone
3  4  i-2   iphone
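As a possible alternative to restructuring by hand, json_normalize() can read the nested list directly through its record_path and meta parameters. The sketch below assumes pandas 1.0 or newer (for pd.json_normalize); the names phones and df_4 are introduced here for illustration only.

```python
import pandas as pd

phones = [
    {"brand": {"type": "android"}, "names": [{"id": "1", "name": "a-1"}, {"id": "2", "name": "a-2"}]},
    {"brand": {"type": "iphone"}, "names": [{"id": "3", "name": "i-1"}, {"id": "4", "name": "i-2"}]},
]

# record_path: the list whose elements become rows.
# meta: a nested path (brand -> type) copied onto every row from the parent record.
df_4 = pd.json_normalize(phones, record_path="names", meta=[["brand", "type"]])

# The meta column is named "brand.type" by default; rename and reorder to match the question.
df_4 = df_4.rename(columns={"brand.type": "type"})[["type", "id", "name"]]
print(df_4)
```

This avoids the explicit loops, at the cost of the extra rename step for the meta column.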