Create multiple new columns from multiple dictionaries

Time: 2020-06-02 12:05:05

Tags: python-3.x pandas dataframe

I can successfully create a new column on a DataFrame named df using a single dictionary, as follows:

  1. Create the source DataFrame mye2_men
In [13]: mye2_men = pd.read_csv("~/03_Maps_March_2020/mye2_men.csv",index_col="Code")   
    ...: mye2_men.head()                                                                                                                                                                                                                 
Out[13]: 
                        Name Geography1  All ages       0       1       2       3       4       5       6       7       8       9  ...      78      79      80      81      82      83      84      85     86     87     88     89      90
Code                                                                                                                               ...                                                                                                    
K02000001     UNITED KINGDOM    Country  32790202  382332  395273  408684  408882  412553  421934  434333  427809  419161  414994  ...  192839  186251  175626  160475  146314  132941  116050  103669  93155  81174  68110  55652  183486
K03000001      GREAT BRITAIN    Country  31864002  370474  382933  395754  396181  399764  409061  420947  414613  406062  401647  ...  188073  181546  171350  156506  142851  129815  113306  101194  91038  79342  66699  54387  179629
K04000001  ENGLAND AND WALES    Country  29215251  343642  355122  366722  366885  370156  379046  389944  382853  375940  370701  ...  172046  166392  157065  143896  131207  119193  104143   93055  83798  73224  61794  50297  167009
E92000001            ENGLAND    Country  27667942  327309  338368  349229  349199  352148  360688  370995  363496  356965  351790  ...  161540  156343  147733  135514  123492  112133   98000   87528  79030  69067  58264  47498  157788
E12000001         NORTH EAST     Region   1305486   13992   14423   15124   15159   15542   15839   16314   16283   16068   15748  ...    8130    8108    7601    6977    6118    5723    4958    4383   3889   3360   2747   2148    6822

[5 rows x 94 columns]
  2. Create the target DataFrame df
In [14]: df = pd.DataFrame({"A":[num for num in range(0,430)],  
    ...:                    "B":[num**2 for num in range(0,430)],   
    ...:                    "Code":mye2_men.index})  
    ...: df.head()                                                                                                                                                                                                                       
Out[14]: 
   A   B       Code
0  0   0  K02000001
1  1   1  K03000001
2  2   4  K04000001
3  3   9  E92000001
4  4  16  E12000001
  3. Create the dictionary to use in the mapping:
In [15]: male_counts = mye2_men["All ages"].to_dict()  
    ...: male_counts                                                                                                                                                                                                                     
Out[15]: 
{'K02000001': 32790202,
 'K03000001': 31864002,
 'K04000001': 29215251,
 'E92000001': 27667942,
 'E12000001': 1305486,
 'E06000047': 259299,
 'E06000005': 51919,
 'E06000001': 45524 ....}
  4. Map the male_counts dictionary onto the DataFrame df to create the new column "male_count":
In [19]: # CREATE NEW male_count COLUMN IN df 
    ...: df["male_count"] = df["Code"].map(male_counts) 
    ...: df.head()                                                                                                                                                                                                                       
Out[19]: 
   A   B       Code  male_count
0  0   0  K02000001    32790202
1  1   1  K03000001    31864002
2  2   4  K04000001    29215251
3  3   9  E92000001    27667942
4  4  16  E12000001     1305486

A second dictionary:

In [20]: female_counts = (mye2_men["All ages"]+10).to_dict() 
    ...: female_counts                                                                                                                                                                                                                   
Out[20]: 
{'K02000001': 32790212,
 'K03000001': 31864012,
 'K04000001': 29215261,
 'E92000001': 27667952,
 'E12000001': 1305496,
 'E06000047': 259309,
 'E06000005': 51929 ...}

By repeating step 4 above, this time with the female_counts dictionary, I can successfully generate a second column, df["female_count"].

How can I create multiple new df columns (i.e. df["male_count"] and df["female_count"]) in a single step?

Many thanks

Note: the mye2_men data comes from the "MYE2-Males" tab of the following Excel document: https://www.ons.gov.uk/file?uri=%2fpeoplepopulationandcommunity%2fpopulationandmigration%2fpopulationestimates%2fdatasets%2fpopulationestimatesforukenglandandwalesscotlandandnorthernireland%2fmid2019april2020localauthoritydistrictcodes/ukmidyearestimates20192020ladcodes.xls

1 answer:

Answer 0: (score: 1)

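Build a DataFrame from the dictionaries (its index will be the dictionary keys, i.e. the codes), then use DataFrame.join with on='Code' to add all the new columns at once: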

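# Outer keys become column names; the shared inner keys (the codes) become the index
new = pd.DataFrame({'male_count':male_counts, 'female_count':female_counts})
# Match df['Code'] against the index of new (a left join by default)
df = df.join(new, on='Code')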
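
To try this without the ONS file, here is a minimal self-contained sketch of the same join. The three codes and their counts are copied from the output in the question. The three-row df is a toy stand-in for the real 430-row frame, and female_counts is built with the same "+10" trick the question uses.

```python
import pandas as pd

# Toy lookups: a few code -> count pairs taken from the question's output
male_counts = {"K02000001": 32790202, "K03000001": 31864002, "E12000001": 1305486}
# Same "+10" placeholder the question uses for the second dictionary
female_counts = {code: count + 10 for code, count in male_counts.items()}

# Small stand-in for the target frame (the real one has 430 rows)
df = pd.DataFrame({
    "A": [0, 1, 2],
    "B": [0, 1, 4],
    "Code": ["K02000001", "K03000001", "E12000001"],
})

# Dict of dicts: outer keys become column names, inner keys become the index
new = pd.DataFrame({"male_count": male_counts, "female_count": female_counts})

# Look up each value of df["Code"] in the index of `new`; both columns arrive in one step
df = df.join(new, on="Code")
print(df)
```

Because join defaults to a left join, any code in df that is missing from the dictionaries would get NaN in the new columns, which is the same result Series.map gives for unmapped keys.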