Question

I get a ValueError when I compute a rank on a groupby.

What is the correct way to compute a rank within each group?

import pandas as pd

df = pd.concat([pd.DataFrame(dict(col1=[1, 2, 3], col2=[4, 5, 6])),
                pd.DataFrame(dict(col1=[1, 2, 3], col2=[7, 8, 9]))])
df.groupby('col1').col2.rank()

This fails with a ValueError:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
NameError: name 'col1' is not defined
df.groupby('col1').col2.rank()
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "<string>", line 35, in rank
  File "/Users/stephenpettinato/.virtualenvs/newt-env/lib/python2.7/site-packages/pandas/core/groupby.py", line 592, in wrapper
    raise ValueError
ValueError

Answer 1

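The problem here is the index of the input DataFrame: distinct rows share the same index values. pd.concat keeps each frame's original index, so the labels 0, 1, 2 each appear twice.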

df.index
Int64Index([0, 1, 2, 0, 1, 2], dtype='int64')
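
As a quick way to confirm that duplicate index labels are present, the check below is a minimal sketch using `Index.is_unique` and `Index.duplicated()`. The variable names `index_is_unique` and `duplicate_labels` are illustrative only and do not come from the question.

```python
import pandas as pd

# Rebuild the frame from the question; concat keeps each piece's own index.
df = pd.concat([pd.DataFrame(dict(col1=[1, 2, 3], col2=[4, 5, 6])),
                pd.DataFrame(dict(col1=[1, 2, 3], col2=[7, 8, 9]))])

# False when at least one index label occurs more than once.
index_is_unique = df.index.is_unique

# Labels that repeat (every occurrence after the first is flagged).
duplicate_labels = df.index[df.index.duplicated()].tolist()

print(index_is_unique, duplicate_labels)
```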

Resetting the index avoids this error.

df = pd.concat([pd.DataFrame(dict(col1=[1, 2, 3], col2=[4, 5, 6])),
                pd.DataFrame(dict(col1=[1, 2, 3], col2=[7, 8, 9]))])
df.reset_index().groupby('col1').col2.rank()
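
An alternative sketch, assuming you do not need the original index labels at all: pass `ignore_index=True` to `pd.concat` so the combined frame gets a fresh 0..n-1 index from the start. Calling `reset_index(drop=True)` on the concatenated frame should have the same effect. The name `ranks` is illustrative.

```python
import pandas as pd

# ignore_index=True discards the per-frame indexes and assigns 0..5.
df = pd.concat([pd.DataFrame(dict(col1=[1, 2, 3], col2=[4, 5, 6])),
                pd.DataFrame(dict(col1=[1, 2, 3], col2=[7, 8, 9]))],
               ignore_index=True)

# Rank col2 within each col1 group; the result is aligned to df's index.
ranks = df.groupby('col1').col2.rank()
print(ranks)
```

With a unique index, each of the first three rows holds the smaller value in its group and the last three hold the larger one.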

ValueError when calling df.groupby('col1').col2.rank() with pandas on Python 2
