Python Pandas DataFrame: apply a function to groups

Time: 2014-12-08 08:59:11

Tags: python pandas group-by dataframe outliers

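I have a function that detects and removes outliers in experimental data. I want to apply this function to my data, which is stored in a DataFrame. However, the DataFrame contains many subjects and 4 experimental conditions, and the outlier detection should be applied separately at the level of each subject + trialcode combination. This is my data: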

                  subject     trialcode    correct  latency
           0    1790361018         nonsn        1     4051
           1    1790361018     neighbour        1     1266
           2    1790361018     neighbour        1     2145
           3    1790361018         nonsn        0     2959
           4    1790361018  nonneighbour        1     1086
           5    1790361018      nonwords        1     2956
           6    1790361018      nonwords        1     3814
           7    1790361018  nonneighbour        1     4924
           8    1790361018      nonwords        0     4771
           9    1790361018  nonneighbour        0     2654
           10   1790361018     neighbour        1      945
           11   1790361018  nonneighbour        1     1189
           12   1790361018     neighbour        1     1215
           13   1790361018     neighbour        1      800
           14   1790361018     neighbour        1      752
           15   1790361018     neighbour        1      963
           16   1790361018     neighbour        1     1822
           17   1790361018  nonneighbour        1      856
           18   1790361018  nonneighbour        1      695
           19   1790361018      nonwords        1     2020
           20   1790361018     neighbour        1     1303
           21   1790361018  nonneighbour        1     1597
           22   1790361018      nonwords        1     1327
           23   1790361018     neighbour        1     1084
           24   1790361018     neighbour        1     2434
           25   1790361018  nonneighbour        1      917
           26   1790361018     neighbour        1     1170
           27   1790361018      nonwords        0     1388
           28   1790361018      nonwords        1     1871
           29   1790361018     neighbour        1      967

This is the function:

    import numpy as np

    def reject_outliers(data, m=2):  # keep values within m standard deviations of the mean
        return data[abs(data - np.mean(data)) < m * np.std(data)]

Is there a way to apply this function to each subject + trialcode group?
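For illustration, here is a minimal sketch of one way this could be done with `groupby`. It assumes `latency` is the column to filter. The small frame is made-up data with the same column names as above, and `kept` and `clean` are names chosen only for this example.

```python
import numpy as np
import pandas as pd


def reject_outliers(data, m=2):
    # Keep values within m standard deviations of the group mean
    return data[abs(data - np.mean(data)) < m * np.std(data)]


# Made-up example data (not the real experiment): one subject, two trial codes
df = pd.DataFrame({
    "subject": [1790361018] * 12,
    "trialcode": ["neighbour"] * 6 + ["nonwords"] * 6,
    "correct": [1] * 12,
    "latency": [1000, 1010, 990, 1005, 995, 5000,
                2000, 2100, 1900, 2050, 1950, 2020],
})

# Apply the function to the latency values of each subject + trialcode group.
# group_keys=False keeps the original row labels as the index of the result.
kept = (
    df.groupby(["subject", "trialcode"], group_keys=False)["latency"]
      .apply(reject_outliers)
)

# Select the surviving rows from the full frame, in their original order
clean = df.loc[kept.index].sort_index()
print(clean)
```

In this made-up data, the 5000 ms trial in the `neighbour` group is far from that group's mean and is dropped, while every `nonwords` trial is kept. Note that a group with a single trial has a standard deviation of 0, so the strict `<` comparison would reject that trial; very small groups may need special handling.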

0 answers:

No answers