Suppose I have a dataset (df_data) that looks like this:
Time Geography Population
2016 England and Wales 58381200
2017 England and Wales 58744600
2016 Northern Ireland 1862100
2017 Northern Ireland 1870800
2016 Scotland 5404700
2017 Scotland 5424800
2016 Wales 3113200
2017 Wales 3125200
If I do the following:
df_nireland = df_data[df_data['Geography']=='Northern Ireland']
df_wales = df_data[df_data['Geography']=='Wales']
df_scotland = df_data[df_data['Geography']=='Scotland']
df_engl_n_wales = df_data[df_data['Geography']=='England and Wales']
df_england = df_engl_n_wales
df_england['Population'] = df_engl_n_wales['Population'] - df_wales['Population']
Then df_england has NA values in the Population column.
How can I fix this?
By the way, I have read the related posts, but their suggestions (.loc, .copy, etc.) did not work for me.
Answer 0: (score: 1)
I would simply do the following:
df_nireland = df_data[df_data['Geography']=='Northern Ireland'].reset_index(drop=True)
df_wales = df_data[df_data['Geography']=='Wales'].reset_index(drop=True)
df_scotland = df_data[df_data['Geography']=='Scotland'].reset_index(drop=True)
df_engl_n_wales = df_data[df_data['Geography']=='England and Wales'].reset_index(drop=True)
df_england = df_engl_n_wales
df_england['Population'] = df_engl_n_wales['Population'] - df_wales['Population']
Or, in principle a better approach because it keeps the index of the original DataFrame, do the following:
df_nireland = df_data[df_data['Geography']=='Northern Ireland']
df_wales = df_data[df_data['Geography']=='Wales']
df_scotland = df_data[df_data['Geography']=='Scotland']
df_engl_n_wales = df_data[df_data['Geography']=='England and Wales']
df_england = df_engl_n_wales
df_england['Population'] = df_engl_n_wales['Population'] - df_wales['Population'].values
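Both variants work for the same reason: pandas aligns two Series on their index labels before subtracting, and the 'England and Wales' rows (labels 0 and 1) share no labels with the 'Wales' rows (labels 6 and 7). The following is a minimal sketch of that behaviour, assuming the sample data from the question is rebuilt by hand; the names misaligned, england_by_position and england_reset are illustrative only.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand (assumption: default RangeIndex 0..7).
df_data = pd.DataFrame({
    "Time": [2016, 2017, 2016, 2017, 2016, 2017, 2016, 2017],
    "Geography": ["England and Wales", "England and Wales",
                  "Northern Ireland", "Northern Ireland",
                  "Scotland", "Scotland", "Wales", "Wales"],
    "Population": [58381200, 58744600, 1862100, 1870800,
                   5404700, 5424800, 3113200, 3125200],
})

df_engl_n_wales = df_data[df_data["Geography"] == "England and Wales"]  # index labels 0, 1
df_wales = df_data[df_data["Geography"] == "Wales"]                     # index labels 6, 7

# Label-aligned subtraction: no label appears on both sides, so every result is NaN.
misaligned = df_engl_n_wales["Population"] - df_wales["Population"]

# Fix 1: subtract a plain array, which is matched by position and keeps labels 0, 1.
england_by_position = df_engl_n_wales["Population"] - df_wales["Population"].to_numpy()

# Fix 2: reset both indexes so the labels coincide (0, 1 on both sides).
england_reset = (df_engl_n_wales["Population"].reset_index(drop=True)
                 - df_wales["Population"].reset_index(drop=True))

print(misaligned)
print(england_by_position)
print(england_reset)
```

Note that both fixes pair rows purely by position, so they give the right answer only if the two slices list the years in the same order.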
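Answer 1: (score: 0)
This is really an organizational problem. If you pivot the data, you can do the subtraction easily and be sure that it is aligned on Time.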
df_pop = df_data.pivot(index='Time', columns='Geography', values='Population')
df_pop['England'] = df_pop['England and Wales'] - df_pop['Wales']
df_pop
Geography  England and Wales  Northern Ireland  Scotland    Wales   England
Time
2016                58381200           1862100   5404700  3113200  55268000
2017                58744600           1870800   5424800  3125200  55619400
If you need to get back to the original format, you can do the following:
df_pop.stack().to_frame('Population').reset_index()
# Time Geography Population
#0 2016 England and Wales 58381200
#1 2016 Northern Ireland 1862100
#2 2016 Scotland 5404700
#3 2016 Wales 3113200
#4 2016 England 55268000
#5 2017 England and Wales 58744600
#6 2017 Northern Ireland 1870800
#7 2017 Scotland 5424800
#8 2017 Wales 3125200
#9 2017 England 55619400
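If reshaping the whole frame is more than you need, the same alignment on Time can be had by indexing just the two slices by Time before subtracting. This is a different technique from the pivot above, sketched here under the assumption that the question's sample data is rebuilt by hand; the variable names wales, engl_n_wales and england are illustrative.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df_data = pd.DataFrame({
    "Time": [2016, 2017, 2016, 2017, 2016, 2017, 2016, 2017],
    "Geography": ["England and Wales", "England and Wales",
                  "Northern Ireland", "Northern Ireland",
                  "Scotland", "Scotland", "Wales", "Wales"],
    "Population": [58381200, 58744600, 1862100, 1870800,
                   5404700, 5424800, 3113200, 3125200],
})

# Index each slice by Time so that the subtraction pairs rows by year.
wales = df_data.loc[df_data["Geography"] == "Wales"].set_index("Time")["Population"]
engl_n_wales = (df_data.loc[df_data["Geography"] == "England and Wales"]
                .set_index("Time")["Population"])

# Label-aligned subtraction: 2016 minus 2016, 2017 minus 2017.
england = (engl_n_wales - wales).rename("Population")

# Back to the long layout used in the question.
df_england = england.reset_index()
df_england.insert(1, "Geography", "England")
print(df_england)
```

Because rows are matched by year rather than by position, this still works if the two slices are sorted differently; a year missing from either slice yields NaN instead of a silently wrong difference.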