How do I remove outliers from a list of vectors?

Time: 2013-07-25 17:22:57

Tags: r lapply outliers

I have this list of vectors:

tdatm.sp=structure(list(X3CO = c(24.88993835, 25.02366257, 24.90308762
), X3CS = c(25.70629883, 25.26747704, 25.1953907), X3CD = c(26.95723343, 
26.84725571, 26.2314415), X3CSD = c(36.95250702, 36.040905, 36.90475845
), X5CO = c(25.44123077, 24.97585869, 24.86075592), X5CS = c(25.71570396, 
26.10244179, 25.39032555), X5CD = c(27.67508507, 27.18985558, 
26.93682098), X5CSD = c(36.26528549, 34.88553238, 33.97910309
), X7CO = c(24.7142601, 24.08443642, 23.97057915), X7CS = c(24.55734444, 
24.56562042, 24.7589817), X7CD = c(27.14260101, 26.65704346, 
26.49533081), X7CSD = c(33.89881897, 32.91091919, 32.79199219
), X9CO = c(26.86141014, 26.42648888, 25.8350563), X9CS = c(28.17367744, 
27.27400589, 26.58813667), X9CD = c(28.88915062, 28.32597542, 
28.2713623), X9CSD = c(34.61352158, 35.84189987, 35.80329132)), .Names = c("X3CO", 
"X3CS", "X3CD", "X3CSD", "X5CO", "X5CS", "X5CD", "X5CSD", "X7CO", 
"X7CS", "X7CD", "X7CSD", "X9CO", "X9CS", "X9CD", "X9CSD"))

> head(tdatm.sp)
$X3CO
[1] 24.88994 25.02366 24.90309

$X3CS
[1] 25.70630 25.26748 25.19539

$X3CD
[1] 26.95723 26.84726 26.23144

$X3CSD
[1] 36.95251 36.04091 36.90476

$X5CO
[1] 25.44123 24.97586 24.86076

$X5CS
[1] 25.71570 26.10244 25.39033

I want to remove outliers from each individual vector using the Hampel method.

One way I found to do this is:

repoutliers=function(x){ med=median(x); mad=mad(x); x[x>med+3*mad | x<med-3*mad]=NA; return(x)}
lapply(tdatm.sp, repoutliers)

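But I would like to know whether a new function can be declared directly inside lapply. lapply sends each individual vector to the function repoutliers; do you know how to operate on those individual vectors directly inside lapply? Suppose I swap repoutliers for the "replace" function: could I do the same thing by referring to the individual vector in the arguments passed to replace (lapply(X, FUN, ...); ... = arguments to replace)?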

In short: how can I manipulate the individual vectors that lapply sends to the function, from within the lapply call itself?

1 answer:

Answer 0: (score: 2)

This is more or less a tomato/tomahto thing. Doing it all within lapply doesn't gain you much.

lapply( tdatm.sp, function(x){ 
    med=median(x)
    mad=mad(x)
    x[x>med+3*mad | x<med-3*mad]=NA
    return(x)} )

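Now lapply just sends each element to the anonymous function. It does the same job as before, but it is a convenient syntax if you don't want the function hanging around afterwards.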
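On the "replace" part of the question: the extra arguments in lapply(X, FUN, ...) are passed unchanged to every call of FUN, so they cannot refer to the vector currently being processed. A minimal sketch of one way around that is to wrap replace in an anonymous function, which does see each vector as x. The list vals and its values below are made up for illustration and are not taken from the question's data.

```r
# Hypothetical example data: 'a' contains one obvious outlier (50).
vals <- list(a = c(10, 10.2, 9.9, 10.1, 50), b = c(1, 2, 3))

# The anonymous function receives each vector as x. replace() sets every
# element lying more than 3 MADs from the median to NA, which is the same
# rule as x > med + 3*mad | x < med - 3*mad in the code above.
cleaned <- lapply(vals, function(x) {
  replace(x, abs(x - median(x)) > 3 * mad(x), NA)
})

print(cleaned)
```

Here only the value 50 in a becomes NA; b comes back unchanged. Whether this reads better than the named repoutliers function is mostly a matter of taste.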