I have processed some complex data and sorted the values by DateTime, which gives me the following DataFrame:
Tme Vtc Stc
DateTime
2012-07-01 00:00:00.000000000 0.00000 7.06 4.51
2012-07-01 00:00:00.000000000 0.00000 5.47 1.11
2012-07-01 00:00:00.000000000 0.00000 5.81 2.32
2012-07-01 00:00:00.000000000 0.00000 7.20 7.65
2012-07-01 00:00:00.000000000 0.00000 11.63 18.30
2012-07-01 00:00:00.000000000 0.00000 4.97 0.58
2012-07-01 00:00:00.000000000 0.00000 4.93 0.37
2012-07-01 00:00:00.000000000 0.00000 4.78 0.19
2012-07-01 00:00:00.000000000 0.00000 5.62 2.13
2012-07-01 00:00:00.000000000 0.00000 6.07 1.67
2012-07-01 00:00:00.000000000 0.00000 7.29 6.21
2012-07-01 00:00:29.980799999 0.00833 4.97 0.58
2012-07-01 00:00:29.980799999 0.00833 5.62 2.13
2012-07-01 00:00:29.980799999 0.00833 7.19 7.63
2012-07-01 00:00:29.980799999 0.00833 11.63 18.33
2012-07-01 00:00:29.980799999 0.00833 6.07 1.67
2012-07-01 00:00:29.980799999 0.00833 4.77 0.19
2012-07-01 00:00:29.980799999 0.00833 4.94 0.38
2012-07-01 00:00:29.980799999 0.00833 7.07 4.54
2012-07-01 00:00:29.980799999 0.00833 5.82 2.34
2012-07-01 00:00:29.980799999 0.00833 5.47 1.11
2012-07-01 00:00:29.980799999 0.00833 7.28 6.15
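My goal is to average the data for each timestamp and resample it every 3 minutes. I can do that with the following code:
m3Reample2 = plotVTEC.resample('3T').agg(dict(Tme='first', Vtc='mean', Stc='mean'))
The resulting DataFrame looks like this: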
Vtc Tme Stc
DateTime
2012-07-01 00:00:00 6.433377 0.00000 4.083636
2012-07-01 00:03:00 6.427455 0.05833 4.085455
2012-07-01 00:06:00 6.428182 0.10000 4.104091
2012-07-01 00:09:00 6.431169 0.15000 4.141039
2012-07-01 00:12:00 6.453818 0.20833 4.233636
2012-07-01 00:15:00 6.484697 0.25000 4.350758
2012-07-01 00:18:00 6.544416 0.30000 4.566623
2012-07-01 00:21:00 6.564231 0.35833 4.580385
2012-07-01 00:24:00 6.459677 0.40000 4.289355
2012-07-01 00:27:00 6.450649 0.45000 4.379091
2012-07-01 00:30:00 6.482727 0.50833 4.515455
2012-07-01 00:33:00 6.501061 0.55000 4.620758
2012-07-01 00:36:00 6.677857 0.60000 5.182738
2012-07-01 00:39:00 6.632500 0.65833 5.084167
2012-07-01 00:42:00 6.598194 0.70000 5.015417
2012-07-01 00:45:00 6.537738 0.75000 4.885595
2012-07-01 00:48:00 6.482000 0.80833 4.772333
2012-07-01 00:51:00 6.424861 0.85000 4.641111
2012-07-01 00:54:00 6.343333 0.90000 4.459286
2012-07-01 00:57:00 6.286167 0.95833 4.334167
2012-07-01 01:00:00 6.230139 1.00000 4.213472
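The problem is that, for a given datetime such as 2012-07-01 00:00:00, the Vtc value 11.63 is an outlier, and it skews the mean for that datetime.
How can I exclude such values from the mean calculation?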
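One possible approach, shown here only as a sketch, is to pass a custom function to agg that drops outliers inside each 3-minute bin before averaging. The helper name mean_without_outliers, the IQR rule, and the threshold k=1.5 are assumptions made for illustration; what counts as an outlier depends on the data. The frame below holds only the eleven rows for 2012-07-01 00:00:00 copied from the sample above, and the frequency is written as '3min', which is equivalent to '3T'.

```python
import pandas as pd


def mean_without_outliers(values: pd.Series, k: float = 1.5) -> float:
    """Mean of the values inside [Q1 - k*IQR, Q3 + k*IQR].

    The IQR rule and k=1.5 are assumed here, not taken from the question.
    """
    if values.empty:
        return float("nan")
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    kept = values[values.between(q1 - k * iqr, q3 + k * iqr)]
    return kept.mean()


# Rows for the first timestamp, copied from the sample data above.
index = pd.to_datetime(["2012-07-01 00:00:00"] * 11)
plotVTEC = pd.DataFrame(
    {
        "Tme": [0.0] * 11,
        "Vtc": [7.06, 5.47, 5.81, 7.20, 11.63, 4.97, 4.93, 4.78, 5.62, 6.07, 7.29],
        "Stc": [4.51, 1.11, 2.32, 7.65, 18.30, 0.58, 0.37, 0.19, 2.13, 1.67, 6.21],
    },
    index=index,
)
plotVTEC.index.name = "DateTime"

# Same resample call as before, with the plain mean swapped for the filtered mean.
m3Resample = plotVTEC.resample("3min").agg(
    {"Tme": "first", "Vtc": mean_without_outliers, "Stc": mean_without_outliers}
)
print(m3Resample)
```

On these eleven rows the filter drops 11.63 from Vtc and 18.30 from Stc, so the Vtc mean for the bin falls from about 6.44 to 5.92. Note that the filter is applied per 3-minute bin, not per individual timestamp; a median ('median' in place of 'mean') would be a simpler robust alternative if an exact outlier rule is not needed.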