Please help.
file.json
[
{"fullname": "mona", "phones": [{"phone": "21323131"}], "areas": [{"area": "Texas"}, {"area": "New York"}] },
{"fullname": "joni", "phones": [{"phone": "546465464"},{"phone": "45345353"}], "areas": [{"area": "California"},{"area": "San Jose"}] }
]
I have a DataFrame like this:
import pandas as pd
df = pd.read_json('file.json')
print(df.head(2))
Output:
fullname  phones                                           areas
mona      [{'phone': '21323131'}]                          [{'area': 'Texas'}, {'area': 'New York'}]
joni      [{'phone': '546465464'}, {'phone': '45345353'}]  [{'area': 'California'}, {'area': 'San Jose'}]
How can I extract a DataFrame like this?
fullname  phone      Areas
mona      21323131   Texas, New York
joni      546465464  California, San Jose
joni      45345353   California, San Jose
Answer 0 (score: 2)
Let's try explode together with Series.str.get:
s = df['areas'].explode().str.get('area').groupby(level=0).agg(', '.join)
d = df.explode('phones').assign(areas=s, phones=lambda x: x['phones'].str.get('phone'))
print(d)
  fullname     phones                 areas
0     mona   21323131       Texas, New York
1     joni  546465464  California, San Jose
1     joni   45345353  California, San Jose
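
To see what each step of this answer contributes, the chain can be unrolled as in the sketch below. It assumes the records from the question are built inline rather than read from file.json, and the intermediate names (records, area_dicts, area_names, joined_areas, result) are illustrative only.

```python
import pandas as pd

# Same records as file.json in the question, built inline so the sketch is self-contained.
records = [
    {"fullname": "mona", "phones": [{"phone": "21323131"}],
     "areas": [{"area": "Texas"}, {"area": "New York"}]},
    {"fullname": "joni", "phones": [{"phone": "546465464"}, {"phone": "45345353"}],
     "areas": [{"area": "California"}, {"area": "San Jose"}]},
]
df = pd.DataFrame(records)

# 1. One row per area dict; the original row label is repeated for each element.
area_dicts = df["areas"].explode()

# 2. Pull the 'area' value out of each dict.
area_names = area_dicts.str.get("area")

# 3. Collapse back to one string per original row by grouping on the index.
joined_areas = area_names.groupby(level=0).agg(", ".join)

# 4. One row per phone dict; assign aligns joined_areas on the repeated index labels.
result = df.explode("phones").assign(
    areas=joined_areas,
    phones=lambda x: x["phones"].str.get("phone"),
)
print(result)
```

Because explode keeps the original row label on every new row, the index of the result repeats (0, 1, 1); calling reset_index(drop=True) afterwards would renumber it if that matters.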
Answer 1 (score: 1)
Here is a possible solution:
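import json
import pandas as pd

# Load the raw records directly instead of going through pd.read_json
with open('file.json') as f:
    data = json.load(f)

# One record per (phone, area) pair; zip matches them by position
exploded_data = [{'fullname': x['fullname'],
                  'phone': p['phone'],
                  'area': a['area']}
                 for x in data for p, a in zip(x['phones'], x['areas'])]
df = pd.DataFrame(exploded_data)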
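
Note that zip pairs each phone with the area in the same position and stops at the shorter list. For the sample data, mona therefore gets a single row with Texas only, and joni's two phones are split between California and San Jose. If the layout asked for in the question is wanted instead (one row per phone, with all areas joined), a variant along these lines might work. The inline data list and the name rows are assumptions for illustration, not part of the original answer.

```python
import pandas as pd

# Same records as file.json in the question, built inline so the sketch is self-contained.
data = [
    {"fullname": "mona", "phones": [{"phone": "21323131"}],
     "areas": [{"area": "Texas"}, {"area": "New York"}]},
    {"fullname": "joni", "phones": [{"phone": "546465464"}, {"phone": "45345353"}],
     "areas": [{"area": "California"}, {"area": "San Jose"}]},
]

# One record per phone; every record carries all of that person's areas as one string.
rows = [
    {"fullname": x["fullname"],
     "phone": p["phone"],
     "areas": ", ".join(a["area"] for a in x["areas"])}
    for x in data
    for p in x["phones"]
]
df = pd.DataFrame(rows)
print(df)
```

Unlike the explode-based approach, this builds a fresh DataFrame, so the index is a plain 0..n-1 range.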