Question

Here is a minimal reproducible example of my original DataFrame, called 'calls':

       phone_number    call_outcome   agent  call_number
0      83473306392   NOT INTERESTED  orange            0
1     762850680150  CALL BACK LATER  orange            1
2     476309275079   NOT INTERESTED  orange            2
3     899921761538  CALL BACK LATER     red            3
4     906739234066  CALL BACK LATER  orange            4

Running this pandas command...

most_calls = calls.groupby('agent') \
.count().sort('call_number', ascending=False)

returns this...

           phone_number  call_outcome  call_number
agent                                          
orange          2234          2234         2234
red             1478          1478         1478
black            750           750          750
green            339           339          339
blue             199           199          199

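This is correct, but I actually want 'agent' to be a variable (a regular column) rather than the index.

I have used the as_index=False argument on many occasions and am familiar with specifying axis=1. In this case, however, it does not seem to matter where or how I pass these arguments: every permutation returns an error.

Here are some of the things I tried, with the corresponding errors: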

most_calls = calls.groupby('agent', as_index=False) \
.count().sort('call_number', ascending=False)

ValueError: invalid literal for long() with base 10: 'black'

and

most_calls = calls.groupby('agent', as_index=False, axis=1) \
.count().sort('call_number', ascending=False)

ValueError: as_index=False only valid for axis=0

Answer 1

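I believe that, whichever groupby operation you have performed, you only need to call reset_index to turn the index column back into a regular column.

Starting from a mock-up of your data: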

import pandas as pd
calls = pd.DataFrame({
    'agent': ['orange', 'red'],
    'phone_number': [2234, 1478],
    'call_outcome': [2234, 1478],
})
>> calls
    agent  call_outcome  phone_number
0  orange          2234          2234
1     red          1478          1478

This is your operation with reset_index() appended:

>> calls.groupby('agent').count().sort_values('phone_number', ascending=False).reset_index()
    agent  call_outcome  phone_number
0  orange             1             1
1     red             1             1
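
For reference, the same idea can be sketched against the five sample rows from the question. This is a minimal sketch, assuming a recent pandas release in which DataFrame.sort is no longer available and sort_values is used instead; the variable name most_calls is taken from the question, and the data is just the five rows shown there.

```python
import pandas as pd

# The five sample rows from the question.
calls = pd.DataFrame({
    "phone_number": [83473306392, 762850680150, 476309275079,
                     899921761538, 906739234066],
    "call_outcome": ["NOT INTERESTED", "CALL BACK LATER", "NOT INTERESTED",
                     "CALL BACK LATER", "CALL BACK LATER"],
    "agent": ["orange", "orange", "orange", "red", "orange"],
    "call_number": [0, 1, 2, 3, 4],
})

most_calls = (
    calls.groupby("agent")                         # 'agent' becomes the index
    .count()                                       # non-null count per column
    .sort_values("call_number", ascending=False)   # busiest agent first
    .reset_index()                                 # move 'agent' back to a column
)

print(most_calls)
```

After reset_index(), 'agent' is an ordinary column and the row labels are a fresh 0..n-1 range.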

Answer 2

Use reset_index to move the index into a regular column.

calls.groupby('agent').count().sort_values('call_number', ascending=False).reset_index()

Out[117]: 
      agent  phone_number  call_outcome  call_number
0    orange             4             4            4
1       red             1             1            1
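
If only the number of calls per agent is needed, one possible variation is to count rows with size() and name the resulting column while resetting the index. This is a sketch under assumptions, not part of the original answer: the column name n_calls and the variable call_counts are made up for illustration, and the data is again the five sample rows from the question.

```python
import pandas as pd

# Only the 'agent' column matters for a per-agent row count.
calls = pd.DataFrame({
    "agent": ["orange", "orange", "orange", "red", "orange"],
    "call_number": [0, 1, 2, 3, 4],
})

call_counts = (
    calls.groupby("agent")
    .size()                          # Series of row counts, indexed by agent
    .sort_values(ascending=False)    # busiest agent first
    .reset_index(name="n_calls")     # 'agent' becomes a column; counts get a name
)

print(call_counts)
```

This gives a two-column frame (agent, n_calls) instead of repeating the same count in every column.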

pandas 'as_index' argument not working as expected
