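I am trying to apply a multivariate function within a data.frame using ddply, in order to detect multivariate outliers per group. I expect to get a vector or a new column containing 1 (inlier) and 0 (outlier), taken from the wfinal01 value returned by the sign1 function of the mvoutlier package. The following code is one example of what I tried, without success: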
library(plyr)
library(mvoutlier)
data(coffee)
myFunc<- function(X) sign1(unclass(X), qcrit=0.975)$wfinal01
ddply(coffee, .(sort), transform, outliers=myFunc(c(Metpyr, `5-Met`, furfu)))
It returns the following error message:
Error in apply(x, 2, mad) : dim(X) must have a positive length
Answer 0 (score: 3)
Your problem is that c creates a single numeric vector, whereas you want to pass a matrix with three columns. You can do this with cbind.
ddply(coffee, .(sort), transform, outliers=myFunc(cbind(Metpyr, `5-Met`, furfu)))
Metpyr X5.Met furfu sort outliers
1 12.50 8.51 6.20 arabica 0
2 5.33 11.80 17.80 arabica 1
3 2.56 7.16 13.67 arabica 0
4 8.59 8.40 14.39 arabica 1
5 8.22 14.86 20.35 arabica 1
6 7.73 12.23 21.02 arabica 1
7 6.07 12.60 14.25 arabica 1
8 5.88 11.19 15.39 arabica 1
9 10.34 11.90 9.81 arabica 1
10 6.26 10.49 16.90 arabica 1
11 5.47 15.04 24.87 arabica 1
12 1.39 12.76 19.51 arabica 1
13 5.10 13.42 16.93 arabica 1
14 3.72 12.65 21.35 arabica 1
15 4.33 12.72 18.47 arabica 1
16 7.38 15.00 21.58 arabica 1
17 12.13 11.68 15.59 blended 1
18 14.41 8.99 16.42 blended 1
19 8.86 6.98 8.40 blended 1
20 15.47 5.89 5.37 blended 1
21 7.55 13.74 22.26 blended 1
22 14.47 8.76 11.28 blended 1
23 11.34 12.62 14.15 blended 1
24 14.25 8.02 8.69 blended 1
25 6.85 13.38 23.83 blended 1
26 9.93 9.05 7.52 blended 1
27 8.59 14.29 18.50 blended 1
A plain vector has no dim attribute, whereas apply needs a matrix or an array (hence the error).
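To make the difference concrete, here is a minimal base R sketch. The vectors a, b and d are hypothetical stand-ins for the three columns, and their values are made up for illustration:

```r
# Toy columns standing in for Metpyr, 5-Met and furfu (made-up values)
a <- c(12.5, 5.3, 2.6, 8.6)
b <- c(8.5, 11.8, 7.2, 8.4)
d <- c(6.2, 17.8, 13.7, 14.4)

v <- c(a, b, d)      # concatenates everything into one long vector
m <- cbind(a, b, d)  # binds the vectors as columns of a 4 x 3 matrix

dim(v)  # NULL
dim(m)  # 4 3

# apply() over columns fails on the vector but works on the matrix
res <- tryCatch(apply(v, 2, mad), error = function(e) conditionMessage(e))
col_mads <- apply(m, 2, mad)  # one mad value per column
```

Here res holds the error message raised for the vector, while col_mads holds one value per column of the matrix.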
Edit - referencing columns by number
I think referencing columns by number is dangerous, but you can do it if you use data.table. data.table will also be faster and more efficient than ddply.
library(data.table)
CD <- data.table(coffee)
CD[, outlier := sign1(.SD, qcrit = 0.975)$wfinal01,by = sort, .SDcols = 1:3]
CD
Metpyr 5-Met furfu sort outlier
1: 12.50 8.51 6.20 arabica 0
2: 5.33 11.80 17.80 arabica 1
3: 2.56 7.16 13.67 arabica 0
4: 8.59 8.40 14.39 arabica 1
5: 8.22 14.86 20.35 arabica 1
6: 7.73 12.23 21.02 arabica 1
7: 6.07 12.60 14.25 arabica 1
8: 5.88 11.19 15.39 arabica 1
9: 10.34 11.90 9.81 arabica 1
10: 6.26 10.49 16.90 arabica 1
11: 5.47 15.04 24.87 arabica 1
12: 1.39 12.76 19.51 arabica 1
13: 5.10 13.42 16.93 arabica 1
14: 3.72 12.65 21.35 arabica 1
15: 4.33 12.72 18.47 arabica 1
16: 7.38 15.00 21.58 arabica 1
17: 12.13 11.68 15.59 blended 1
18: 14.41 8.99 16.42 blended 1
19: 8.86 6.98 8.40 blended 1
20: 15.47 5.89 5.37 blended 1
21: 7.55 13.74 22.26 blended 1
22: 14.47 8.76 11.28 blended 1
23: 11.34 12.62 14.15 blended 1
24: 14.25 8.02 8.69 blended 1
25: 6.85 13.38 23.83 blended 1
26: 9.93 9.05 7.52 blended 1
27: 8.59 14.29 18.50 blended 1
Metpyr 5-Met furfu sort outlier
You could just as easily (and more explicitly) pass c('Metpyr', '5-Met', 'furfu') as the argument to .SDcols.
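Selecting the columns by name might look like the sketch below. It reuses the coffee data and the sign1 call from above. The variable cols is only an illustrative name, and the sketch assumes the column names are exactly as printed in the data.table output:

```r
library(data.table)
library(mvoutlier)

data(coffee)
CD <- data.table(coffee)

# Name the columns to hand to sign1 instead of relying on their positions
cols <- c("Metpyr", "5-Met", "furfu")

# .SD holds only the columns listed in .SDcols, one subset per group of sort
CD[, outlier := sign1(.SD, qcrit = 0.975)$wfinal01, by = sort, .SDcols = cols]
CD
```

This should flag the same rows as the .SDcols = 1:3 version, but it keeps working if the column order of the table changes.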