How to conditionally append a pandas Series to another DataFrame

Time: 2019-03-21 15:48:12

Tags: python pandas dataframe join append

I have two DataFrames. One holds the names given to newborns and the frequency of each name per year (1880-2017).

name    gender  frequency  year
Mary       F       7065    1880
Anna       F       2604    1880
Emma       F       2003    1880
Elizabeth  F       1939    1880
Minnie     F       1746    1880
...

The other holds the year and the total number of births (1880-2017).

 birth_year    Male    Female   Total
     1880     118400    97605  216005
     1881     108282    98855  207137
     1882     122031   115695  237726
     1883     112477   120059  232536
     1884     122738   137586  260324
...

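The two DataFrames have different sizes, but wherever the birth year matches I want to append the columns of the second DataFrame to the first, so that I can include the percentage of the population. I would like to do something like this: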

for i in range(len(all_names_nat_DF)):
    for j in range(len(total_births)):
        if all_names_nat_DF['year'][i] == total_births['birth_year']:
            all_names_nat_DF.append(total_births['birth_year'][j])

However, this gives me the error ValueError: Can only compare identically-labeled Series objects

1 Answer:

Answer 0 (score: 2)

You want to use df.merge, which joins the two DataFrames on their year columns without any explicit loop.

df

     name gender  frequency  year
0    Mary      F       7065  1880
1    Anna      F       2604  1880
2    Emma      F       2003  1880
3    Eliz      F       1939  1880
4  Minnie      F       1746  1880


births

   birth_year    Male  Female   Total
0        1880  118400   97605  216005
1        1881  108282   98855  207137
2        1882  122031  115695  237726
3        1883  112477  120059  232536
4        1884  122738  137586  260324

df.merge(births, how='inner', left_on='year', right_on='birth_year')

     name gender  frequency  year  birth_year    Male  Female   Total
0    Mary      F       7065  1880        1880  118400   97605  216005
1    Anna      F       2604  1880        1880  118400   97605  216005
2    Emma      F       2003  1880        1880  118400   97605  216005
3    Eliz      F       1939  1880        1880  118400   97605  216005
4  Minnie      F       1746  1880        1880  118400   97605  216005
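
Since the question mentions a percentage, here is a minimal sketch of how the merged result might be used to compute one. It rebuilds only the first few rows of the tables above. The column name percent and the choice to divide by Total (rather than by Male or Female) are assumptions for illustration, not something stated in the question.

```python
import pandas as pd

# Toy data: the first rows of the two tables shown above.
df = pd.DataFrame({
    "name": ["Mary", "Anna", "Emma"],
    "gender": ["F", "F", "F"],
    "frequency": [7065, 2604, 2003],
    "year": [1880, 1880, 1880],
})
births = pd.DataFrame({
    "birth_year": [1880, 1881],
    "Male": [118400, 108282],
    "Female": [97605, 98855],
    "Total": [216005, 207137],
})

# Same merge as above: match df["year"] against births["birth_year"].
merged = df.merge(births, how="inner", left_on="year", right_on="birth_year")

# birth_year now duplicates year, so it can be dropped.
merged = merged.drop(columns="birth_year")

# Hypothetical "percent" column: each name's share of all births that year.
merged["percent"] = 100 * merged["frequency"] / merged["Total"]

print(merged)
```

With how='inner', name rows whose year has no match in births are dropped; how='left' would keep them, with NaN in the columns that come from births.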