My dataset looks like this:
Zipcodes Population Precipitation
10 10 100
45 20 200
58 30 300
11 40 400
22 50 500
19 60 600
I want to group some of the zip codes (for example 10, 22 and 19) and name that group "a". My desired output looks like this:
Regions Population Precipitation
a 120 1200
k 90 900
Answer 0 (score: 0)
If every zip code is assigned to a group by lists, for example in a dictionary:
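# group name -> list of zip codes in that group
d = {'a': [10, 22, 19],
     'b': [45, 58, 11]}
# invert the dict so that each zip code maps to its group name
# http://stackoverflow.com/a/31674731/2901002
d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}
# replace each zip code with its group name, then sum per group
df['Zipcodes'] = df['Zipcodes'].map(d1)
df = df.groupby('Zipcodes', as_index=False).sum()
print(df)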
Zipcodes Population Precipitation
0 a 120 1200
1 b 90 900
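As a self-contained sketch of the dictionary approach, the snippet below rebuilds the sample data from the question and writes the group names to a separate Regions column, so the original Zipcodes column is left untouched. The names groups, zip_to_region and result are only illustrative, and the column name Regions is taken from the desired output in the question.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Zipcodes": [10, 45, 58, 11, 22, 19],
    "Population": [10, 20, 30, 40, 50, 60],
    "Precipitation": [100, 200, 300, 400, 500, 600],
})

# Illustrative names: group name -> zip codes, then the inverted lookup.
groups = {"a": [10, 22, 19], "b": [45, 58, 11]}
zip_to_region = {z: region for region, zips in groups.items() for z in zips}

# Label each row with its region, then sum only the numeric columns of interest.
result = (
    df.assign(Regions=df["Zipcodes"].map(zip_to_region))
      .groupby("Regions", as_index=False)[["Population", "Precipitation"]]
      .sum()
)
print(result)
```

Note that a zip code missing from the dictionary is mapped to NaN, and groupby leaves such rows out of the result by default.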
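Or, if you only need to match the values in one list and put all other values into a second group:
import numpy as np

# 'a' for the listed zip codes, 'b' for everything else
df['Zipcodes'] = np.where(df['Zipcodes'].isin([10, 22, 19]), 'a', 'b')
df = df.groupby('Zipcodes', as_index=False).sum()
print(df)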
Zipcodes Population Precipitation
0 a 120 1200
1 b 90 900
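To make the catch-all behaviour of the np.where variant visible, here is a small sketch that adds one extra row to the sample data. The zip code 99 and its values are hypothetical and do not appear in the question. Because 99 is not in the list, it falls into group "b" together with 45, 58 and 11.

```python
import numpy as np
import pandas as pd

# Sample data from the question plus one hypothetical row (zip code 99).
df = pd.DataFrame({
    "Zipcodes": [10, 45, 58, 11, 22, 19, 99],
    "Population": [10, 20, 30, 40, 50, 60, 70],
    "Precipitation": [100, 200, 300, 400, 500, 600, 700],
})

# 'a' for the listed zip codes, 'b' for every other zip code, including 99.
df["Regions"] = np.where(df["Zipcodes"].isin([10, 22, 19]), "a", "b")

result = df.groupby("Regions", as_index=False)[["Population", "Precipitation"]].sum()
print(result)
```

This differs from the dictionary approach, where a zip code without an entry gets no group at all instead of a default one.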