How do I pass a column name to the level parameter of the groupby function in Pandas?

Asked: 2017-10-20 14:28:36

Tags: python pandas pandas-groupby

I'm having trouble passing a level name to the groupby function in Pandas. My DataFrame is very large, with 34 columns.

Shpr_Resi_Ratio = (
    data[data.Resi == 'Y'].groupby(level='Shpr_ID').count() /
    data.groupby(level='Shpr_ID').count()
)

Error:

2523                     raise ValueError('level name %s is not the name of the '
-> 2524                                      'index' % level)
   2525             elif level > 0 or level < -1:
   2526                 raise ValueError('level > 0 or level < -1 only valid with '

ValueError: level name Shpr_ID is not the name of the index

How can I fix this?

Sample DataFrame:

 Stop_Type  Resi    Co_Name Lat Lng Cust_ID Qty Phone   Shpr_ID
0   D   N   ROBECO HONG KONG    22.283737   114.156219  NaN 1   0   348772830.0
1   D   N   NIKKO ASSET MANAGEMENT HK LIMI  22.283737   114.156219  NaN 1   85239403900 811633127.0
2   D   N   CFA INSTITUTE HONG KONG OFFICE  22.283737   114.156219  NaN 1   8.52E+11    22901265.0
3   D   N   VICTON REGISTRATIONS LIMITED    22.283144   114.155122  NaN 1   85228450884 269243180.0
4   D   N   DING FUNG LIMITED   22.282634   114.155592  NaN 1   85223919307 100724987.0
5   D   N   QUAM LIMITED    22.281737   114.156819  NaN 6   85222172878 193550630.0
6   D   N   CANARA BANK 22.281737   114.156819  NaN 1   85225291398 911433524.0
7   D   N   GIA HONG KONG   22.281737   114.156819  NaN 1   85223030075 90470655.0
8   D   Y   ZAABA CAPITAL LIMITED   22.281737   114.156819  NaN 1   8772461225  260103490.0
9   D   N   FIRESTAR DIAMOND HK 22.280644   114.158432  NaN 1   25303677    659886588.0

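I'm trying to compute the ratio of two counts: for each Shpr_ID, the number of rows with Resi == 'Y' divided by the total number of rows. Expected output: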

Resi    Shpr_ID Shpr_ID_Ratio
Y   577030944   0.933333333
N   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333
Y   577030944   0.933333333

2 Answers:

Answer 0 (score: 0)

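Are you trying to group by the 'Shpr_ID' column?

If so, pass the column name as the grouping key instead of as level (level only accepts names of index levels) and change the code to: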

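# No float() around the denominator: float() cannot convert a DataFrame,
# and / between pandas objects already performs true division.
Shpr_Resi_Ratio = (
    data[data.Resi == 'Y'].groupby(['Shpr_ID']).count() /
    data.groupby(['Shpr_ID']).count()
)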

That should take care of it.
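To illustrate why the original call fails, here is a minimal sketch on made-up toy data (only the column names Shpr_ID and Resi come from the question; the values and the variable names are assumptions). It shows that level= looks the name up in the index, and that either grouping by the column or first moving the column into the index avoids the error. It uses size() rather than count(), which is a swapped-in choice that gives one row count per group instead of one count per column.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
data = pd.DataFrame({
    "Shpr_ID": [1, 1, 1, 2, 2],
    "Resi": ["Y", "N", "Y", "N", "N"],
})

# level= refers to index levels, so a plain column name is rejected.
try:
    data.groupby(level="Shpr_ID").size()
    level_failed = False
except (ValueError, KeyError):
    level_failed = True

# Option 1: group by the column itself.
total_by = data.groupby("Shpr_ID").size()
resi_by = data[data["Resi"] == "Y"].groupby("Shpr_ID").size()

# Option 2: move the column into the index, after which level= is valid.
indexed = data.set_index("Shpr_ID")
total_level = indexed.groupby(level="Shpr_ID").size()

# Shippers without any 'Y' row are missing from resi_by; count them as 0.
ratio = resi_by.reindex(total_by.index, fill_value=0) / total_by
print(ratio)
```

The reindex step matters because dividing two Series aligns them on their index, so a shipper that never appears in the filtered data would otherwise come out as NaN rather than 0.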

Answer 1 (score: 0)

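# Rows per shipper, and rows per shipper where Resi == 'Y'
Shpr_ID_total = data.groupby(['Shpr_ID']).agg({'Shpr_ID': 'count'})
Shpr_ID_Y = data[data['Resi'] == 'Y'].groupby(['Shpr_ID']).agg({'Shpr_ID': 'count'})

def computeResi(Shpr_ID):
    """Return the share of rows with Resi == 'Y' for one Shpr_ID (0 if it has none)."""
    ratio = 0

    try:
        ratio = Shpr_ID_Y.Shpr_ID[Shpr_ID] / Shpr_ID_total.Shpr_ID[Shpr_ID]
    except KeyError:
        # This Shpr_ID has no rows with Resi == 'Y' (or is not in the data)
        pass

    return ratio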
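As an alternative to looking up each Shpr_ID in a helper function, the per-row ratio column from the question's expected output can be sketched with a grouped transform. This is a sketch on made-up toy data, not the answerer's method: the values are hypothetical, and the idea is that the mean of a 0/1 indicator within each group equals the share of 'Y' rows.

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
data = pd.DataFrame({
    "Shpr_ID": [577030944, 577030944, 577030944, 90470655, 90470655],
    "Resi": ["Y", "N", "Y", "N", "N"],
})

# 1.0 where Resi == 'Y', else 0.0; the group mean is then the 'Y' share.
is_resi = data["Resi"].eq("Y").astype(float)

# transform() returns one value per original row, aligned with data's index.
data["Shpr_ID_Ratio"] = is_resi.groupby(data["Shpr_ID"]).transform("mean")
print(data)
```

Because transform keeps the original row order and length, the result can be assigned directly as a new column, and shippers with no 'Y' rows get 0 without any special handling.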