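I ran a pymongo query and loaded the results into a pandas DataFrame named "crime". I am now trying to group by 'Primary Type' and 'Year', count the records, and have the count appear as a column named Count in the DataFrame so I can plot it. I tried the following, but I cannot get the Count heading to appear.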
new = crime.drop('_id', axis=1)
g = crime.drop('Date', axis=1)
g = new.groupby(['Primary Type', 'Year'])
g.columns = ['Count', 'Primary Type', 'Year']
I have also tried:
g = new.groupby(['Primary Type', 'Year']).
['_id']count().reset_index(name="Count")
File "<ipython-input-100-5d65646da11c>", line 3
g = new.groupby(['Primary Type', 'Year']).
['_id']count().reset_index(name="Count")
^
SyntaxError: invalid sy
Answer 0 (score: 1)
I believe you need to wrap the long chained expression in parentheses (), or end each continued line with a backslash \:
g = (new.groupby(['Primary Type', 'Year'])['_id']
        .count()
        .reset_index(name="Count")
        .reindex(columns=['Count', 'Primary Type', 'Year']))

g = new.groupby(['Primary Type', 'Year'])['_id'] \
       .count() \
       .reset_index(name="Count") \
       .reindex(columns=['Count', 'Primary Type', 'Year'])
I also omitted the drop code and added reindex to change the order of the columns.
Sample:
import pandas as pd

new = pd.DataFrame({'Primary Type': list('aaabbb'),
                    'Year': [2001, 2001, 2002, 2002, 2002, 2002],
                    '_id': [7, 8, 9, 4, 2, 3],
                    'Date': [1, 3, 5, 7, 1, 0]})
print(new)
   Date Primary Type  Year  _id
0     1            a  2001    7
1     3            a  2001    8
2     5            a  2002    9
3     7            b  2002    4
4     1            b  2002    2
5     0            b  2002    3
g = (new.groupby(['Primary Type', 'Year'])['_id']
        .count()
        .reset_index(name="Count")
        .reindex(columns=['Count', 'Primary Type', 'Year']))
print(g)
   Count Primary Type  Year
0      2            a  2001
1      1            a  2002
2      3            b  2002
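
As a possible variation on the answer above (not part of the original answer), here is a minimal sketch that counts rows per group with size() instead of selecting '_id' and calling count(). It assumes the same sample data as above. The name g2 is made up for illustration. Note that size() counts every row in a group, while count() skips missing values, so the two only agree when '_id' has no NaN entries.

```python
import pandas as pd

# Sample data copied from the example above (an assumption about the real "crime" frame)
new = pd.DataFrame({'Primary Type': list('aaabbb'),
                    'Year': [2001, 2001, 2002, 2002, 2002, 2002],
                    '_id': [7, 8, 9, 4, 2, 3],
                    'Date': [1, 3, 5, 7, 1, 0]})

# size() counts the rows in each group, so no column has to be selected first
g2 = (new.groupby(['Primary Type', 'Year'])
         .size()
         .reset_index(name='Count'))

# Plain column selection reorders the columns, much like reindex(columns=...)
g2 = g2[['Count', 'Primary Type', 'Year']]
print(g2)
```

Because the whole chain sits inside one pair of parentheses, Python treats it as a single expression, which avoids the SyntaxError from ending a line with a dangling dot.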