Unable to set column names after a groupby in pandas?

Time: 2018-01-12 13:10:42

Tags: python pandas pymongo

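I have run a pymongo query and pulled the results into a pandas DataFrame named "crime". I am now trying to group by 'Primary Type' and 'Year', count the records, and have that count appear under its own column header in the DataFrame so I can plot it. I have tried the following, but I cannot get a header to appear for the count.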

new = crime.drop('_id', axis=1)
g = crime.drop('Date', axis=1)
g = new.groupby(['Primary Type', 'Year'])
g.columns = ['Count', 'Primary Type', 'Year']

I have also tried:

g = new.groupby(['Primary Type', 'Year']).
['_id']count().reset_index(name="Count")


File "<ipython-input-100-5d65646da11c>", line 3
g = new.groupby(['Primary Type', 'Year']).
['_id']count().reset_index(name="Count")
                                          ^
SyntaxError: invalid sy

1 Answer:

Answer 0: (score: 1)

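I believe that, because the chained code is long and split across lines, you need to wrap it in parentheses () or end each continued line with \ (note also the . before count()):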

g = (new.groupby(['Primary Type', 'Year'])['_id']
        .count()
        .reset_index(name="Count")
        .reindex(columns=['Count', 'Primary Type', 'Year']))

g = new.groupby(['Primary Type', 'Year'])['_id'] \
       .count()  \
       .reset_index(name="Count") \
       .reindex(columns=['Count', 'Primary Type', 'Year'])

I also left out the drop code and added reindex to change the order of the columns.
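As a rough illustration of why the original attempt fails: a line that ends in a bare . is an incomplete expression, so Python stops there with a SyntaxError, while parentheses or a trailing \ tell the parser that the statement continues on the next line. The sketch below only checks whether each spelling parses, so it does not need pandas or the crime data. The helper name compiles and the three source strings are made up for this demonstration.

```python
def compiles(source):
    """Return True if `source` parses as Python code, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# The statement is cut right after the dot, so the first line is incomplete.
broken = (
    "g = new.groupby(['Primary Type', 'Year']).\n"
    "['_id'].count()\n"
)

# Parentheses let the expression continue across lines.
parenthesized = (
    "g = (new.groupby(['Primary Type', 'Year'])['_id']\n"
    "        .count())\n"
)

# A backslash as the very last character of the line also continues it.
backslash = (
    "g = new.groupby(['Primary Type', 'Year'])['_id'] \\\n"
    "       .count()\n"
)

print(compiles(broken), compiles(parenthesized), compiles(backslash))
```

The parenthesized form is usually the easier one to maintain, since a stray space after a backslash is enough to break the continuation.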

Sample:

import pandas as pd

new = pd.DataFrame({'Primary Type':list('aaabbb'),
                    'Year':[2001,2001,2002,2002,2002,2002],
                    '_id':[7,8,9,4,2,3],
                    'Date':[1,3,5,7,1,0]})

print (new)
   Date Primary Type  Year  _id
0     1            a  2001    7
1     3            a  2001    8
2     5            a  2002    9
3     7            b  2002    4
4     1            b  2002    2
5     0            b  2002    3

g = (new.groupby(['Primary Type', 'Year'])['_id']
        .count()
        .reset_index(name="Count")
        .reindex(columns=['Count', 'Primary Type', 'Year']))
print (g)
   Count Primary Type  Year
0      2            a  2001
1      1            a  2002
2      3            b  2002
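A possible variation, offered as a sketch and not as part of the original answer: if the goal is only to count rows per group, size() can be used in place of selecting '_id' and calling count(). The two differ when the counted column contains missing values, because count() skips NaN while size() counts every row. The variable names by_count and by_size are made up for this example, and the sample frame is the same one used above.

```python
import pandas as pd

# Same sample data as in the answer above.
new = pd.DataFrame({'Primary Type': list('aaabbb'),
                    'Year': [2001, 2001, 2002, 2002, 2002, 2002],
                    '_id': [7, 8, 9, 4, 2, 3],
                    'Date': [1, 3, 5, 7, 1, 0]})

# count() on one column: counts the non-missing '_id' values in each group.
by_count = (new.groupby(['Primary Type', 'Year'])['_id']
               .count()
               .reset_index(name="Count"))

# size() on the groupby itself: counts all rows in each group.
by_size = (new.groupby(['Primary Type', 'Year'])
              .size()
              .reset_index(name="Count"))

print(by_count)
print(by_size)
```

With this sample both frames hold the same counts, since '_id' has no missing values.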