Question

My dataset looks like this:

Zipcodes  Population  Precipitation
10             10      100
45             20      200
58             30      300
11             40      400
22             50      500
19             60      600

I want to group some of the zip codes (for example 10, 22, 19) and name that group "a". My desired output looks like:

Regions   Population   Precipitation
a             120          1200
k              90           900

Answer 1

If every zip code is assigned to a group in a list, for example in a dictionary:

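# df is assumed to be the DataFrame shown in the question
d = {'a': [10, 22, 19],
     'b': [45, 58, 11]}
# invert the dict so each zip code maps to its group name
# http://stackoverflow.com/a/31674731/2901002
d1 = {k: oldk for oldk, oldv in d.items() for k in oldv}

# replace each zip code with its group name
df['Zipcodes'] = df['Zipcodes'].map(d1)

# sum the numeric columns within each group
df = df.groupby('Zipcodes', as_index=False).sum()
print(df)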
  Zipcodes  Population  Precipitation
0        a         120           1200
1        b          90            900
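
For readers who want to run the dictionary approach end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and writes the group labels to a new column named Regions (matching the desired output) instead of overwriting Zipcodes. The names regions, zip_to_region and result are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Zipcodes": [10, 45, 58, 11, 22, 19],
    "Population": [10, 20, 30, 40, 50, 60],
    "Precipitation": [100, 200, 300, 400, 500, 600],
})

# Group name -> zip codes in that group (same grouping as the answer).
regions = {"a": [10, 22, 19], "b": [45, 58, 11]}

# Invert to zip code -> group name so it can be passed to Series.map.
zip_to_region = {z: name for name, zips in regions.items() for z in zips}

# Label each row, then sum the numeric columns per label.
result = (
    df.assign(Regions=df["Zipcodes"].map(zip_to_region))
      .groupby("Regions", as_index=False)[["Population", "Precipitation"]]
      .sum()
)
print(result)
```

Note that a zip code missing from the dictionary is mapped to NaN, and groupby drops NaN keys by default, so such rows would silently disappear from the totals.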

Or, if you only need to separate the values in one list from all other values:

import numpy as np

# label zip codes in the list 'a' and everything else 'b'
df['Zipcodes'] = np.where(df['Zipcodes'].isin([10, 22, 19]), 'a', 'b')
df = df.groupby('Zipcodes', as_index=False).sum()
print(df)
  Zipcodes  Population  Precipitation
0        a         120           1200
1        b          90            900
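
The same two-way split can be sketched as a standalone script. This assumes the sample data from the question; the names group_a, labels and result are illustrative. Unlike the dictionary approach, every zip code outside the list falls into the second group, so no rows are dropped.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Zipcodes": [10, 45, 58, 11, 22, 19],
    "Population": [10, 20, 30, 40, 50, 60],
    "Precipitation": [100, 200, 300, 400, 500, 600],
})

# Zip codes that belong to group 'a'; everything else becomes 'b'.
group_a = [10, 22, 19]
labels = np.where(df["Zipcodes"].isin(group_a), "a", "b")

# Attach the labels as a Regions column and sum per region.
result = (
    df.assign(Regions=labels)
      .groupby("Regions", as_index=False)[["Population", "Precipitation"]]
      .sum()
)
print(result)
```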
