I can successfully create a new column on a DataFrame called df using a dictionary, as follows:
Read in mye2_men:
In [13]: mye2_men = pd.read_csv("~/03_Maps_March_2020/mye2_men.csv",index_col="Code")
...: mye2_men.head()
Out[13]:
                        Name  Geography1  All ages       0       1       2       3       4       5       6       7       8       9  ...      78      79      80      81      82      83      84      85      86      87      88      89      90
Code                                                                                                                                ...
K02000001     UNITED KINGDOM     Country  32790202  382332  395273  408684  408882  412553  421934  434333  427809  419161  414994  ...  192839  186251  175626  160475  146314  132941  116050  103669   93155   81174   68110   55652  183486
K03000001      GREAT BRITAIN     Country  31864002  370474  382933  395754  396181  399764  409061  420947  414613  406062  401647  ...  188073  181546  171350  156506  142851  129815  113306  101194   91038   79342   66699   54387  179629
K04000001  ENGLAND AND WALES     Country  29215251  343642  355122  366722  366885  370156  379046  389944  382853  375940  370701  ...  172046  166392  157065  143896  131207  119193  104143   93055   83798   73224   61794   50297  167009
E92000001            ENGLAND     Country  27667942  327309  338368  349229  349199  352148  360688  370995  363496  356965  351790  ...  161540  156343  147733  135514  123492  112133   98000   87528   79030   69067   58264   47498  157788
E12000001         NORTH EAST      Region   1305486   13992   14423   15124   15159   15542   15839   16314   16283   16068   15748  ...    8130    8108    7601    6977    6118    5723    4958    4383    3889    3360    2747    2148    6822
[5 rows x 94 columns]
Create df:
In [14]: df = pd.DataFrame({"A":[num for num in range(0,430)],
...: "B":[num**2 for num in range(0,430)],
...: "Code":mye2_men.index})
...: df.head()
Out[14]:
   A   B       Code
0  0   0  K02000001
1  1   1  K03000001
2  2   4  K04000001
3  3   9  E92000001
4  4  16  E12000001
Create a dictionary from mye2_men:
In [15]: male_counts = mye2_men["All ages"].to_dict()
...: male_counts
Out[15]:
{'K02000001': 32790202,
'K03000001': 31864002,
'K04000001': 29215251,
'E92000001': 27667942,
'E12000001': 1305486,
'E06000047': 259299,
'E06000005': 51919,
'E06000001': 45524 ....}
Map the dictionary onto df to create the new column "male_count":
In [19]: # CREATE NEW male_count COLUMN IN df
...: df["male_count"] = df["Code"].map(male_counts)
...: df.head()
Out[19]:
   A   B       Code  male_count
0  0   0  K02000001    32790202
1  1   1  K03000001    31864002
2  2   4  K04000001    29215251
3  3   9  E92000001    27667942
4  4  16  E12000001     1305486
Now I create a second dictionary:
In [20]: female_counts = (mye2_men["All ages"]+10).to_dict()
...: female_counts
Out[20]:
{'K02000001': 32790212,
'K03000001': 31864012,
'K04000001': 29215261,
'E92000001': 27667952,
'E12000001': 1305496,
'E06000047': 259309,
'E06000005': 51929 ...}
By repeating the mapping step above (In [19]), this time with the female_counts dictionary, I can successfully generate a second column, df["female_count"].
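For concreteness, the repeated step would presumably look like the sketch below. The three-row df and the truncated dictionaries are stand-ins built from the values shown above, not the full 430-row dataset.

```python
import pandas as pd

# Toy stand-ins: the first three codes and counts from the output above.
male_counts = {"K02000001": 32790202, "K03000001": 31864002, "K04000001": 29215251}
female_counts = {code: count + 10 for code, count in male_counts.items()}

df = pd.DataFrame({"A": [0, 1, 2], "B": [0, 1, 4], "Code": list(male_counts)})

# One map call per new column: this is the repetition the question wants to avoid.
df["male_count"] = df["Code"].map(male_counts)
df["female_count"] = df["Code"].map(female_counts)
print(df)
```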
How can I create multiple new df columns (i.e. df["male_count"] and df["female_count"]) in one step?
Many thanks
Note: the mye2_men data comes from the "MYE2 - Males" tab of the following Excel document: https://www.ons.gov.uk/file?uri=%2fpeoplepopulationandcommunity%2fpopulationandmigration%2fpopulationestimates%2fdatasets%2fpopulationestimatesforukenglandandwalesscotlandandnorthernireland%2fmid2019april2020localauthoritydistrictcodes/ukmidyearestimates20192020ladcodes.xls
Answer 0 (score: 1)
Create a DataFrame from the dictionaries, then use DataFrame.join:
new = pd.DataFrame({'male_count':male_counts, 'female_count':female_counts})
df = df.join(new, on='Code')
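A self-contained sketch of this approach, using a three-row stand-in for df and truncated dictionaries (assumptions for illustration; the real data has 430 rows). Building a DataFrame from a dict of dicts turns the outer keys into column names and the inner keys into the index, so join(on="Code") can match that index against the Code column of df.

```python
import pandas as pd

# Toy stand-ins: the first three codes and counts from the question.
male_counts = {"K02000001": 32790202, "K03000001": 31864002, "K04000001": 29215251}
female_counts = {code: count + 10 for code, count in male_counts.items()}

df = pd.DataFrame({"A": [0, 1, 2], "B": [0, 1, 4], "Code": list(male_counts)})

# Outer keys become columns; the shared inner keys (the codes) become the index.
new = pd.DataFrame({"male_count": male_counts, "female_count": female_counts})

# Left join (the default): every row of df is kept, matched on its Code value.
result = df.join(new, on="Code")
print(result)
```

Because the join is a left join by default, a code in df that is absent from the dictionaries would be kept with NaN in both new columns, which matches what Series.map does for a missing key.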