Question

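I'm a student, so I'm a newbie. I'm trying to create a pandas DataFrame of crime statistics by San Francisco neighborhood. My problem is that I want the column names to be just "Neighborhood" and "Count". Instead, I seem to be stuck with a separate header line that reads "('Neighborhood', 'count')" rather than proper labels. Here's the code: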

df_counts = df_incidents.copy()
df_counts.rename(columns={'PdDistrict':'Neighborhood'}, inplace=True)
df_counts.drop(['IncidntNum', 'Category', 'Descript', 'DayOfWeek', 'Date', 'Time', 'Location', 'Resolution', 'Address', 'X', 'Y', 'PdId'], axis=1, inplace=True)
df_totals=df_counts.groupby(['Neighborhood']).agg({'Neighborhood':['count']})
df_totals.columns = list(map(str, df_totals.columns)) # Not sure if I need this
df_totals

Output:

('Neighborhood', 'count')
Neighborhood    
BAYVIEW     14303
CENTRAL     17666
INGLESIDE   11594
MISSION     19503
NORTHERN    20100
PARK        8699
RICHMOND    8922
SOUTHERN    28445
TARAVAL     11325
TENDERLOIN  9942

Answer 1

You don't need agg() here; you can simply do the following:

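# size() counts rows per group; count() would return no columns here because only the grouping column is left
df_totals = df_counts.groupby('Neighborhood').size()
# move the index into a 'Neighborhood' column and name the counts 'Count'
df_totals = df_totals.reset_index(name='Count')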
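
As for why the header showed up as a tuple in the first place: passing a list of functions to agg() gives the result two-level (MultiIndex) columns, and map(str, ...) then turns each column key, which is a tuple, into its string form. Here is a minimal sketch with made-up data. The frame df_demo and its IncidntNum values are assumptions for illustration, not the real incident data, and it aggregates a non-grouping column to keep the example simple.

```python
import pandas as pd

# Made-up stand-in data; the 'IncidntNum' values are illustrative only.
df_demo = pd.DataFrame({
    "Neighborhood": ["MISSION", "PARK", "MISSION", "CENTRAL"],
    "IncidntNum": [101, 102, 103, 104],
})

# A list of functions inside agg() yields two-level (MultiIndex) columns.
df_multi = df_demo.groupby("Neighborhood").agg({"IncidntNum": ["count"]})
column_keys = df_multi.columns.tolist()  # each key is a tuple

# str() on each tuple gives its text form, which is where the odd header comes from.
stringified = list(map(str, df_multi.columns))
print(stringified)

# Assign a plain label instead, then move the index into a regular column.
df_flat = df_multi.copy()
df_flat.columns = ["Count"]
df_flat = df_flat.reset_index()
print(df_flat.columns.tolist())
```

Assigning the labels directly avoids the stringified tuple, and reset_index() is what turns Neighborhood from the index into an ordinary column.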

If you want to print the output without the numeric index:

print(df_totals.to_string(index=False))
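
To try the whole thing end to end, here is a small self-contained sketch. The rows are made up (your real df_counts comes from the incident data), and it uses groupby(...).size(), which counts rows per group even when the grouping column is the only column left.

```python
import pandas as pd

# Made-up stand-in for df_counts: only the 'Neighborhood' column remains.
df_counts = pd.DataFrame({
    "Neighborhood": ["MISSION", "PARK", "MISSION", "CENTRAL", "MISSION", "PARK"],
})

# size() returns a Series of row counts indexed by neighborhood;
# reset_index(name=...) turns it into a two-column DataFrame.
df_totals = df_counts.groupby("Neighborhood").size().reset_index(name="Count")

# to_string(index=False) renders the table without the 0, 1, 2, ... row labels.
table = df_totals.to_string(index=False)
print(table)
```

The printed table should have one header row with Neighborhood and Count, followed by one row per neighborhood in alphabetical order, since groupby sorts the group keys by default.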
