Extracting lists from a pandas DataFrame column

Time: 2020-10-24 15:05:25

Tags: python pandas dataframe

Please help.

file.json
[
{"fullname": "mona", "phones": [{"phone": "21323131"}], "areas": [{"area": "Texas"}, {"area": "New York"}] }, 
{"fullname": "joni", "phones": [{"phone": "546465464"},{"phone": "45345353"}], "areas": [{"area": "California"},{"area": "San Jose"}] }
]

I have a DataFrame like this:

import pandas as pd

df = pd.read_json('file.json')

print(df.head(2))

Output:

fullname   phones                                          areas
mona       [{'phone': '21323131'}]                         [{'area': 'Texas'}, {'area': 'New York'}]   
joni       [{'phone': '546465464'},{'phone': '45345353'}]  [{'area': 'California'},{'area': 'San Jose'}] 

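How can I turn it into a DataFrame like this, with one row per phone number and the areas joined into a single string?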

fullname   phone        Areas
mona       21323131     Texas, New York 
joni       546465464    California, San Jose
joni       45345353     California, San Jose

2 answers:

Answer 0 (score: 2)

Let's try explode together with Series.str.get:

s = df['areas'].explode().str.get('area').groupby(level=0).agg(', '.join)
d = df.explode('phones').assign(areas=s, phones=lambda x: x['phones'].str.get('phone'))

print(d)

  fullname     phones                 areas
0     mona   21323131       Texas, New York
1     joni  546465464  California, San Jose
1     joni   45345353  California, San Jose
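
For readers who find the two chained lines dense, here is the same approach broken into steps. This is only a sketch: the records are inlined so it runs without file.json, and the intermediate names (records, areas_long, area_names, areas_joined, result) are made up for illustration.

```python
import pandas as pd

# Inline copy of the records from file.json, so the sketch runs without the file.
records = [
    {"fullname": "mona", "phones": [{"phone": "21323131"}],
     "areas": [{"area": "Texas"}, {"area": "New York"}]},
    {"fullname": "joni", "phones": [{"phone": "546465464"}, {"phone": "45345353"}],
     "areas": [{"area": "California"}, {"area": "San Jose"}]},
]
df = pd.DataFrame(records)

# Step 1: one row per area dict; the original row label is repeated.
areas_long = df["areas"].explode()

# Step 2: pull the 'area' value out of each dict.
area_names = areas_long.str.get("area")

# Step 3: join the names back into one string per original row label.
areas_joined = area_names.groupby(level=0).agg(", ".join)

# Step 4: one row per phone dict, then replace each dict with its plain value.
# assign() aligns areas_joined on the (repeated) row labels.
result = df.explode("phones").assign(
    areas=areas_joined,
    phones=lambda x: x["phones"].str.get("phone"),
)

print(result)
```

The repeated index (0, 1, 1) is what lets the joined areas line up with every phone row. If a clean 0..n-1 index is wanted afterwards, result.reset_index(drop=True) should do it.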

Answer 1 (score: 1)

Here is one possible solution:

import json
import pandas as pd

with open('file.json') as f:
    data = json.load(f)

exploded_data = [{'fullname': x['fullname'],
                  'phone': p['phone'],
                  'area': a['area']}
                 for x in data for p, a in zip(x['phones'], x['areas'])]

df = pd.DataFrame(exploded_data)
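
One caveat about the zip-based comprehension above: zip pairs the first phone with the first area, the second with the second, and stops at the shorter list. With the sample data that yields mona / 21323131 / Texas, joni / 546465464 / California and joni / 45345353 / San Jose, which differs from the layout requested in the question. A possible plain-Python variant that emits one row per phone and joins all areas is sketched below; the data is inlined so it runs without file.json, and the names rows, person and joined_areas are illustrative, not part of the original answer.

```python
import pandas as pd

# Inline copy of the records from file.json, so the sketch runs without the file.
data = [
    {"fullname": "mona", "phones": [{"phone": "21323131"}],
     "areas": [{"area": "Texas"}, {"area": "New York"}]},
    {"fullname": "joni", "phones": [{"phone": "546465464"}, {"phone": "45345353"}],
     "areas": [{"area": "California"}, {"area": "San Jose"}]},
]

rows = []
for person in data:
    # Join every area for this person once, instead of pairing areas with phones.
    joined_areas = ", ".join(a["area"] for a in person["areas"])
    # Emit one output row per phone number.
    for p in person["phones"]:
        rows.append({"fullname": person["fullname"],
                     "phone": p["phone"],
                     "areas": joined_areas})

df = pd.DataFrame(rows)
print(df)
```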