In Python

Time: 2017-10-05 01:00:31

Tags: python pandas dataframe methods vectorization

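I am trying to vectorize the get() method, taking values from a column that contains dictionaries and writing them to another column in the same DataFrame. For example, I would like the city from each dictionary in the address column to populate the address.city column.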

import pandas as pd

df = pd.DataFrame({
    'address': [
        {'city': 'Lake Ashley', 'state': 'MN', 'street': '56833 Baker Branch', 'zip': '15884'},
        {'city': 'Reginaldfurt', 'state': 'MO', 'street': '045 Bennett Motorway Suite 404', 'zip': '68916'},
        {'city': 'East Stephaniefurt', 'state': 'VI', 'street': '908 Matthew Ports Suite 313', 'zip': '15956-9706'},
    ],
    'address.city': [None, None, None],
    'address.street': [None, None, None],
})

I am trying:

df['address.city'].apply(df.address.get('city'))

but this doesn't work. I think I am close, because df.address[0].get('city') does extract the city value for that row. As you can imagine, I want to do the same for address.street.

1 answer:

Answer 0: (score: 2)

I think what you want is shown further below. However, you can also parse the whole address column into separate columns like this:

df.address.apply(pd.Series).add_prefix('address.')
# or
# pd.DataFrame(df.address.tolist()).add_prefix('address.')

         address.city address.state                  address.street address.zip
0         Lake Ashley            MN              56833 Baker Branch       15884
1        Reginaldfurt            MO  045 Bennett Motorway Suite 404       68916
2  East Stephaniefurt            VI     908 Matthew Ports Suite 313  15956-9706
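
If the goal is to keep the original address column next to the expanded ones, the parsed frame can be joined back onto the source. The following is only a minimal sketch under a few assumptions: the frame holds just the address column (the placeholder address.city and address.street columns from the question would clash with the new names in a join), every row is a dictionary, and the names expanded and result are made up for illustration.

```python
import pandas as pd

# Assumption: only the 'address' column exists, so the join below
# does not collide with pre-existing 'address.*' placeholder columns.
df = pd.DataFrame({'address': [
    {'city': 'Lake Ashley', 'state': 'MN', 'street': '56833 Baker Branch', 'zip': '15884'},
    {'city': 'Reginaldfurt', 'state': 'MO', 'street': '045 Bennett Motorway Suite 404', 'zip': '68916'},
]})

# Build one column per dictionary key, keeping the original index
# so that the rows line up when joining.
expanded = pd.DataFrame(df['address'].tolist(), index=df.index).add_prefix('address.')

# Attach the expanded columns next to the original column.
result = df.join(expanded)
print(result.columns.tolist())
```

Passing index=df.index matters when the source frame does not use the default 0..n-1 index, since join aligns rows on index labels.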

This answers your question:

df['address.city'] = df.address.apply(lambda d: d['city'])

df

                                             address        address.city address.street
0  {'city': 'Lake Ashley', 'state': 'MN', 'street...         Lake Ashley           None
1  {'city': 'Reginaldfurt', 'state': 'MO', 'stree...        Reginaldfurt           None
2  {'city': 'East Stephaniefurt', 'state': 'VI', ...  East Stephaniefurt           None
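
The same pattern extends to address.street, and d.get(key) can replace d[key] when some dictionaries might lack a key. This is a hedged sketch rather than part of the answer above: the helper extract_key is a hypothetical name, and the sample rows with a missing key and a missing address are invented to show the fallback behaviour.

```python
import pandas as pd

df = pd.DataFrame({'address': [
    {'city': 'Lake Ashley', 'street': '56833 Baker Branch'},
    {'city': 'Reginaldfurt'},  # invented row: no 'street' key
    None,                      # invented row: no address at all
]})

def extract_key(value, key):
    """Hypothetical helper: return value[key] for dicts, otherwise None."""
    return value.get(key) if isinstance(value, dict) else None

# Fill one target column per key; 'args' forwards the key to the helper.
for key in ('city', 'street'):
    df['address.' + key] = df['address'].apply(extract_key, args=(key,))

print(df[['address.city', 'address.street']])
```

Note that apply still calls the Python function once per row, so this is a convenience rather than true vectorization; for large frames, building the columns in one pass from df.address.tolist() may be faster.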