Pandas Groupby中的重复键结果

时间:2016-08-31 08:53:03

标签: python pandas

我正在使用Pandas来过滤身份验证日志。我想只选择那些只有一种授权配置文件(条件'CALLING_STATION_ID')的客户端(由列名df['\'SELECTED_AZN_PROFILES\'']=='\'AD_Access\''标识)。

由于客户端可能多次重复身份验证,因此我使用的是Pandas GroupBy函数(df.groupby('\'CALLING_STATION_ID\''))。当我迭代GroupBy时,我希望键是唯一的,但是我发现键是重复的。如何获得符合固定标准的唯一键? GroupBy不是要走的路?

以下是示例数据:

'LOGGED AT','ENDPOINTMATCHEDPROFILE','IDENTITY_STORE','CTS_SECURITY_GROUP','NAS_PORT_ID','SELECTED_AZN_PROFILES','SERVER','RESPONSE_TIME','AUDIT_SESSION_ID','ID','MESSAGETEXT','SERVICE_TYPE','FAILURE_REASON','CREDENTIAL_CHECK','IDENTITY_GROUP','NETWORK_DEVICE_NAME','CALLING_STATION_ID','AUTHENTICATION_PROTOCOL','LOCATION','PASSED','AUTHENTICATION_METHOD','USER_NAME','DEVICE_TYPE','EXECUTION_STEPS','NAS_IP_ADDRESS'
'2016-08-30 16:23:08.35','HP-Device','AD1',,'GigabitEthernet6/0/12','AD_Access','ISE','17','0A6301640007425BAF23436B','1471419483080056','Authentication succeeded','Framed',,'MSCHAPV2','Profiled','SwitchA','StationA','PEAP (EAP-MSCHAPv2)','All Locations#Hotel','1','dot1x','StationA','All Device Types',"11001,11017,15049,15008,15048,15048,15048,15048,15048,15004,11507,12500,12625,11006,11001,11018,12301,12300,12625,11006,11001,11018,12302,12318,12800,12805,12806,12807,12810,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12318,12812,12813,12804,12801,12802,12816,12310,12305,11006,11001,11018,12304,12313,11521,12305,11006,11001,11018,12304,11522,11806,12305,11006,11001,11018,12304,11808,15041,15048,15048,15048,15006,15013,24431,24325,24313,24319,24323,24343,24470,22037,11824,12305,11006,11001,11018,12304,11810,11814,11519,12314,12305,11006,11001,11018,12304,15036,15048,24433,24355,24435,15048,15004,15016,11022,12306,11503,11002",'1.1.1.1'
'2016-08-29 23:44:24.254','Workstation','AD1',,'GigabitEthernet8/0/30','AD_Access','ISE','17','0A63016400073C105F18AF70','1471419482619976','Authentication succeeded','Framed',,'MSCHAPV2','Workstation','SwitchA','StationB','PEAP (EAP-MSCHAPv2)','All Locations#Hotel','1','dot1x','StationB','All Device Types',"11001,11017,15049,15008,15048,15048,15048,15048,15048,15004,11507,12500,12625,11006,11001,11018,12301,12300,12625,11006,11001,11018,12302,12318,12800,12805,12806,12807,12810,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12318,12812,12813,12804,12801,12802,12816,12310,12305,11006,11001,11018,12304,12313,11521,12305,11006,11001,11018,12304,11522,11806,12305,11006,11001,11018,12304,11808,15041,15048,15048,15048,15006,15013,24431,24325,24313,24319,24323,24343,24470,22037,11824,12305,11006,11001,11018,12304,11810,11814,11519,12314,12305,11006,11001,11018,12304,15036,15048,24433,24355,24435,15048,15004,15016,11022,12306,11503,11002",'1.1.1.1'
'2016-08-27 02:16:16.327','Workstation','AD1',,'GigabitEthernet8/0/31','AD_Access','ISE','16','0A630164000740DD98214677','1471419481030819','Authentication succeeded','Framed',,'MSCHAPV2','Workstation','SwitchA','StationC','PEAP (EAP-MSCHAPv2)','All Locations#Hotel','1','dot1x','StationC','All Device Types',"11001,11017,15049,15008,15048,15048,15048,15048,15048,15004,11507,12500,12625,11006,11001,11018,12301,12300,12625,11006,11001,11018,12302,12318,12800,12805,12806,12807,12810,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12318,12812,12813,12804,12801,12802,12816,12310,12305,11006,11001,11018,12304,12313,11521,12305,11006,11001,11018,12304,11522,11806,12305,11006,11001,11018,12304,11808,15041,15048,15048,15048,15006,15013,24431,24325,24313,24319,24323,24343,24470,22037,11824,12305,11006,11001,11018,12304,11810,11814,11519,12314,12305,11006,11001,11018,12304,15036,15048,24433,24355,24435,15048,15004,15016,11022,12306,11503,11002",'1.1.1.1'
'2016-08-29 11:51:22.83','Workstation','AD1',,'GigabitEthernet4/0/23','AD_Access','ISE','16','0A63016E000A49E237AF288C','1471419482276506','Authentication succeeded','Framed',,'MSCHAPV2','Workstation','SwitchB','StationD','PEAP (EAP-MSCHAPv2)','All Locations#Hotel','1','dot1x','StationD','All Device Types',"11001,11017,15049,15008,15048,15048,15048,15048,15048,15004,11507,12500,12625,11006,11001,11018,12301,12300,12625,11006,11001,11018,12302,12318,12800,12805,12806,12807,12810,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12318,12812,12813,12804,12801,12802,12816,12310,12305,11006,11001,11018,12304,12313,11521,12305,11006,11001,11018,12304,11522,11806,12305,11006,11001,11018,12304,11808,15041,15048,15048,15048,15006,15013,24431,24325,24313,24319,24323,24343,24470,22037,11824,12305,11006,11001,11018,12304,11810,11814,11519,12314,12305,11006,11001,11018,12304,15036,15048,24433,24355,24435,15048,15004,15016,11022,12306,11503,11002",'1.1.1.1'
'2016-08-28 02:13:21.3','Workstation','AD1',,'GigabitEthernet4/0/23','AD_Access','ISE','15','0A63016E000A49E237AF288C','1471419481444736','Authentication succeeded','Framed',,'MSCHAPV2','Workstation','SwitchB','StationD','PEAP (EAP-MSCHAPv2)','All Locations#Hotel','1','dot1x','StationD','All Device Types',"11001,11017,15049,15008,15048,15048,15048,15048,15048,15004,11507,12500,12625,11006,11001,11018,12301,12300,12625,11006,11001,11018,12302,12318,12800,12805,12806,12807,12810,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12318,12812,12813,12804,12801,12802,12816,12310,12305,11006,11001,11018,12304,12313,11521,12305,11006,11001,11018,12304,11522,11806,12305,11006,11001,11018,12304,11808,15041,15048,15048,15048,15006,15013,24431,24325,24313,24319,24323,24343,24470,22037,11824,12305,11006,11001,11018,12304,11810,11814,11519,12314,12305,11006,11001,11018,12304,15036,15048,24433,24355,24435,15048,15004,15016,11022,12306,11503,11002",'1.1.1.1'
'2016-08-26 22:55:19.754','Workstation','AD1',,'GigabitEthernet4/0/23','AD_Access','ISE','16','0A63016E000A49E237AF288C','1471419480981867','Authentication succeeded','Framed',,'MSCHAPV2','Workstation','SwitchB','StationD','PEAP (EAP-MSCHAPv2)','All Locations#Hotel','1','dot1x','StationD','All Device Types',"11001,11017,15049,15008,15048,15048,15048,15048,15048,15004,11507,12500,12625,11006,11001,11018,12301,12300,12625,11006,11001,11018,12302,12318,12800,12805,12806,12807,12810,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12318,12812,12813,12804,12801,12802,12816,12310,12305,11006,11001,11018,12304,12313,11521,12305,11006,11001,11018,12304,11522,11806,12305,11006,11001,11018,12304,11808,15041,15048,15048,15048,15006,15013,24431,24325,24313,24319,24323,24343,24470,22037,11824,12305,11006,11001,11018,12304,11810,11814,11519,12314,12305,11006,11001,11018,12304,15036,15048,24433,24355,24435,15048,15004,15016,11022,12306,11503,11002",'1.1.1.1'
'2016-08-25 10:22:17.88','Workstation','AD1',,'GigabitEthernet4/0/23','AD_Access','ISE','16','0A63016E000A49E237AF288C','1471419480335106','Authentication succeeded','Framed',,'MSCHAPV2','Workstation','SwitchB','StationD','PEAP (EAP-MSCHAPv2)','All Locations#Hotel','1','dot1x','StationD','All Device Types',"11001,11017,15049,15008,15048,15048,15048,15048,15048,15004,11507,12500,12625,11006,11001,11018,12301,12300,12625,11006,11001,11018,12302,12318,12800,12805,12806,12807,12810,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12318,12812,12813,12804,12801,12802,12816,12310,12305,11006,11001,11018,12304,12313,11521,12305,11006,11001,11018,12304,11522,11806,12305,11006,11001,11018,12304,11808,15041,15048,15048,15048,15006,15013,24431,24325,24313,24319,24323,24343,24470,22037,11824,12305,11006,11001,11018,12304,11810,11814,11519,12314,12305,11006,11001,11018,12304,15036,15048,24433,24355,24435,15048,15004,15016,11022,12306,11503,11002",'1.1.1.1'
'2016-08-24 04:04:46.554','Workstation','AD1',,'GigabitEthernet4/0/23','AD_Access','ISE','16','0A63016E000A49E237AF288C','1471419479854979','Authentication succeeded','Framed',,'MSCHAPV2','Workstation','SwitchB','StationD','PEAP (EAP-MSCHAPv2)','All Locations#Hotel','1','dot1x','StationD','All Device Types',"11001,11017,15049,15008,15048,15048,15048,15048,15048,15004,11507,12500,12625,11006,11001,11018,12301,12300,12625,11006,11001,11018,12302,12318,12800,12805,12806,12807,12810,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12305,11006,11001,11018,12304,12318,12812,12813,12804,12801,12802,12816,12310,12305,11006,11001,11018,12304,12313,11521,12305,11006,11001,11018,12304,11522,11806,12305,11006,11001,11018,12304,11808,15041,15048,15048,15048,15006,15013,24431,24325,24313,24319,24323,24343,24470,22037,11824,12305,11006,11001,11018,12304,11810,11814,11519,12314,12305,11006,11001,11018,12304,15036,15048,24433,24355,24435,15048,15004,15016,11022,12306,11503,11002",'1.1.1.1'

以下是我正在使用的代码:

import pandas as pd
import numpy as np
from pandas import DataFrame, read_csv

uniq = pd.Series.unique

def only_machine(g):
    return (len(uniq(g['\'SELECTED_AZN_PROFILES\''])) == 1 ) and uniq(g['\'SELECTED_AZN_PROFILES\''])[0] == '\'AD_Access\''

mach_auth         = gb.filter(lambda g: only_machine(g)
        )[['\'NETWORK_DEVICE_NAME\'','\'USER_NAME\'','\'CALLING_STATION_ID\'','\'SELECTED_AZN_PROFILES\'']]

结果如下:

>>> df = pd.read_clipboard(sep=',')
>>> gb = df.groupby('\'CALLING_STATION_ID\'')

>>> mach_auth         = gb.filter(lambda g: only_machine(g)                                                     
...         )[['\'NETWORK_DEVICE_NAME\'','\'USER_NAME\'','\'CALLING_STATION_ID\'','\'SELECTED_AZN_PROFILES\'']] 
>>> mach_auth                                                                                                   
  'NETWORK_DEVICE_NAME' 'USER_NAME' 'CALLING_STATION_ID'  \                                                     
0             'SwitchA'  'StationA'           'StationA'                                                        
1             'SwitchA'  'StationB'           'StationB'                                                        
2             'SwitchA'  'StationC'           'StationC'                                                        
3             'SwitchB'  'StationD'           'StationD'                                                        
4             'SwitchB'  'StationD'           'StationD'                                                        
5             'SwitchB'  'StationD'           'StationD'                                                        
6             'SwitchB'  'StationD'           'StationD'                                                        
7             'SwitchB'  'StationD'           'StationD'                                                        

  'SELECTED_AZN_PROFILES'                                                                                       
0             'AD_Access'                                                                                       
1             'AD_Access'                                                                                       
2             'AD_Access'                                                                                       
3             'AD_Access'                                                                                       
4             'AD_Access'                                                                                       
5             'AD_Access'                                                                                       
6             'AD_Access'                                                                                       
7             'AD_Access'                                                                  

虽然我期待这个:

>>> mach_auth
  'NETWORK_DEVICE_NAME' 'USER_NAME' 'CALLING_STATION_ID'  \
0             'SwitchA'  'StationA'           'StationA'
1             'SwitchA'  'StationB'           'StationB'
2             'SwitchA'  'StationC'           'StationC'
3             'SwitchB'  'StationD'           'StationD'

  'SELECTED_AZN_PROFILES'
0             'AD_Access'
1             'AD_Access'
2             'AD_Access'
3             'AD_Access'

0 个答案:

没有答案