Python 2.7: Compare within DataFrame groups and filter based on a condition

Time: 2018-02-25 12:54:50

Tags: python pandas dataframe pandas-groupby

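I have a pandas DataFrame that I plan to group by 'name', 'driverRef', 'tyre', keeping only the groups that have similar values in one column.

Within a group, all rows have the same value in that column.

"Similar" is defined as values lying within a range whose spread is at most 3. For example, if the unique numbers in the column are 5, 10, 12, 13, then only the groups with 10, 12, 13 are kept.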
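
The range-based criterion above can be read in more than one way. A minimal sketch of one possible reading, assuming "similar" means the largest set of unique values whose maximum and minimum differ by at most 3 (the function name `largest_similar_subset` and the parameter `max_spread` are hypothetical, introduced only for illustration):

```python
def largest_similar_subset(values, max_spread=3):
    """Return the largest run of sorted unique values with max - min <= max_spread.

    Assumption: this is only one interpretation of "similar"; if several runs
    have the same size, the one with the smallest values is returned.
    """
    uniq = sorted(set(values))
    best = []
    start = 0
    for end in range(len(uniq)):
        # Move the left edge right until the window fits within max_spread.
        while uniq[end] - uniq[start] > max_spread:
            start += 1
        # Remember the widest window seen so far.
        if end - start + 1 > len(best):
            best = uniq[start:end + 1]
    return best


print(largest_similar_subset([5, 10, 12, 13]))  # prints [10, 12, 13]
```

The rows to keep would then be those whose value appears in the returned list, for instance with `Series.isin`.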

Edit: The similarity criterion I originally planned was ambiguous, so I have changed it to simply the mode.

Expected output:

                     name driverRef  stint        tyre  lap  stint length
0   Australian Grand Prix       ham    1.0  Super soft    1             5
1   Australian Grand Prix    vettel    1.0  Super soft    2            10
2   Australian Grand Prix    bottas    1.0  Super soft    3            10
3   Australian Grand Prix    alonso    2.0  Super soft   20            13
4   Australian Grand Prix    alonso    2.0  Super soft   21            13
5   Australian Grand Prix    alonso    2.0  Super soft   22            13
6      Bahrain Grand Prix       ham    1.0  Super soft    1             5
7      Bahrain Grand Prix    vettel    1.0  Super soft    2             6
8      Bahrain Grand Prix    bottas    1.0  Super soft    3             6
9      Bahrain Grand Prix    alonso    2.0  Super soft   20            13
10     Bahrain Grand Prix    alonso    2.0  Super soft   21            13
11     Bahrain Grand Prix    alonso    2.0  Super soft   22            13

1 Answer:

Answer 0 (score: 0)

I believe you need:

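# for each (name, tyre) group, broadcast the most frequent stint length to every row
s = df.groupby(['name','tyre'])['stint length'].transform(lambda x: x.mode().iat[0])
# alternative
# s = df.groupby(['name','tyre'])['stint length'].transform(lambda x: x.value_counts().index[0])

# keep only the rows whose stint length equals the mode of their group
df = df[df['stint length'] == s]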
print (df)
                     name driverRef  stint        tyre  lap  stint length
3   Australian Grand Prix    alonso    2.0  Super soft   20            13
4   Australian Grand Prix    alonso    2.0  Super soft   21            13
5   Australian Grand Prix    alonso    2.0  Super soft   22            13
9      Bahrain Grand Prix    alonso    2.0  Super soft   20            13
10     Bahrain Grand Prix    alonso    2.0  Super soft   21            13
11     Bahrain Grand Prix    alonso    2.0  Super soft   22            13
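
For anyone who wants to reproduce this, here is a self-contained sketch that rebuilds the sample data from the question and applies the same mode-based filter. The variable names `mode_per_group` and `filtered` are my own, and the frame is typed in by hand from the table above, so treat it as an illustration rather than the asker's actual data.

```python
import pandas as pd

# Sample data re-typed from the question's table (12 rows, two races).
df = pd.DataFrame({
    'name': ['Australian Grand Prix'] * 6 + ['Bahrain Grand Prix'] * 6,
    'driverRef': ['ham', 'vettel', 'bottas', 'alonso', 'alonso', 'alonso'] * 2,
    'stint': [1.0, 1.0, 1.0, 2.0, 2.0, 2.0] * 2,
    'tyre': ['Super soft'] * 12,
    'lap': [1, 2, 3, 20, 21, 22] * 2,
    'stint length': [5, 10, 10, 13, 13, 13, 5, 6, 6, 13, 13, 13],
})

# transform() returns a Series aligned with df, so every row carries
# the mode of its own (name, tyre) group.
mode_per_group = (
    df.groupby(['name', 'tyre'])['stint length']
      .transform(lambda x: x.mode().iat[0])
)

# Boolean mask: keep rows whose stint length equals their group's mode.
filtered = df[df['stint length'] == mode_per_group]
print(filtered.index.tolist())  # prints [3, 4, 5, 9, 10, 11]
```

Note that `Series.mode()` can return several values when there is a tie; `.iat[0]` then picks the smallest of them, since the modes come back sorted. If ties matter for your data, you may want a different tie-breaking rule.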