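I am trying to pass a function (weight.func) to a different function (a wrapper) that calls ddply. I want ddply to use that function (weight.func) as part of its computation. I get the output I want when weight.func is defined globally, but not when it is passed to the wrapper as an anonymous function.
Can I get ddply to do what I want? Here is a code example: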
> print(sampleData)
studentId problem part workerId rating
1 8001 problem26 partA A127R5QI5OGBIK 0.0
2 8001 problem26 partA A1FCLYRBAB430F 0.0
3 8001 problem26 partA A25FZQY34C6RVO 0.0
4 8001 problem26 partA A3G0MO562MHMZ3 0.5
5 8001 problem26 partA A3RB9ZOIUC3NWG 2.0
6 8001 problem26 partB A1FCLYRBAB430F 0.5
7 8001 problem26 partB A1XRDZKSJBWY8Q 0.5
8 8001 problem26 partB A22CRWMZUX7FFR 0.5
9 8001 problem26 partB A25FZQY34C6RVO 1.0
10 8001 problem26 partB A3G0MO562MHMZ3 0.5
11 8001 problem27 partA A1ET309DW6M2XA 2.0
12 8001 problem27 partA A1FCLYRBAB430F 0.0
13 8001 problem27 partA A22CRWMZUX7FFR 0.0
14 8001 problem27 partA A25FZQY34C6RVO 0.0
15 8001 problem27 partA A3G0MO562MHMZ3 0.0
16 8001 problem27 partB A1FCLYRBAB430F 1.0
17 8001 problem27 partB A22CRWMZUX7FFR 0.0
18 8001 problem27 partB A25FZQY34C6RVO 0.0
19 8001 problem27 partB A2U9676210WST5 0.0
20 8001 problem27 partB A3G0MO562MHMZ3 0.0
21 8002 problem26 partA A127R5QI5OGBIK 0.0
22 8002 problem26 partA A1FCLYRBAB430F 0.5
23 8002 problem26 partA A22CRWMZUX7FFR 0.0
24 8002 problem26 partA A25FZQY34C6RVO 2.0
25 8002 problem26 partA A3G0MO562MHMZ3 0.5
26 8002 problem26 partB A17EHJZNJGNRAN 2.0
27 8002 problem26 partB A1FCLYRBAB430F 0.0
28 8002 problem26 partB A2IPRDTE6B4TAB 0.0
29 8002 problem26 partB A3G0MO562MHMZ3 0.0
30 8002 problem26 partB A6SON3OS15XKA 0.0
31 8002 problem27 partA A1FCLYRBAB430F 0.0
32 8002 problem27 partA A25FZQY34C6RVO 0.0
33 8002 problem27 partA A2IPRDTE6B4TAB 0.0
34 8002 problem27 partA A2U9676210WST5 0.0
35 8002 problem27 partA A3G0MO562MHMZ3 0.0
36 8002 problem27 partB A1FCLYRBAB430F 0.0
37 8002 problem27 partB A1V52SSKROBV8E 2.0
38 8002 problem27 partB A25FZQY34C6RVO 2.0
39 8002 problem27 partB A2IPRDTE6B4TAB 0.0
40 8002 problem27 partB A3G0MO562MHMZ3 0.0
>
> #Make a wrapper
> wrapper <- function ( ratingData, weight.func ) {
+ print(weight.func) #prove that the function is being passed
+ ddply(ratingData, c('studentId','problem','part'), summarize,
+ sum.weights = sum ( weight.func(rating) ))
+ }
> wrapper( sampleData, weight.func=function(x) (x+.001)^-1 )
function(x) (x+.001)^-1
Error in data.frame(sum.weights = sum(weight.func(rating))) :
could not find function "weight.func"
>
> #'globally' declare weight.func
> weight.func <- function(x) (x+.001)^-1
> wrapper( sampleData, weight.func=NULL )
NULL
studentId problem part sum.weights
1 8001 problem26 partA 3002.495758
2 8001 problem26 partB 8.983033
3 8001 problem27 partA 4000.499750
4 8001 problem27 partB 4000.999001
5 8002 problem26 partA 2004.491766
6 8002 problem26 partB 4000.499750
7 8002 problem27 partA 5000.000000
8 8002 problem27 partB 3000.999500
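The second output is the goal. Any help is appreciated! (Including non-plyr-based ways of accomplishing the same task.)
The example above is a toy example. It is the simplest case in which I could reproduce the behavior.
Answer 0 (score: 2)
You can use aggregate: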
w2 <- function(d, f){
aggregate(rating ~ studentId + problem + part, data = d, FUN = function(x) sum(f(x)))
}
w2( sampleData, function(x) (x+.001)^-1 )
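Note that the name of the aggregated column is determined automatically (it keeps the name of the response variable, here rating), so if you want a different name you need to set it yourself.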
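A minimal sketch of the renaming step, using only base R. The small data frame `toy` and the wrapper name `w2_named` are made up for illustration and are not part of the original question.

```r
# Hypothetical toy data with the same column names as sampleData
toy <- data.frame(
  studentId = c(8001, 8001, 8002),
  problem   = c("problem26", "problem26", "problem26"),
  part      = c("partA", "partA", "partA"),
  rating    = c(0.5, 2.0, 1.0)
)

# Same idea as w2 above, but the aggregated column is renamed afterwards
w2_named <- function(d, f) {
  res <- aggregate(rating ~ studentId + problem + part,
                   data = d, FUN = function(x) sum(f(x)))
  # aggregate() keeps the response name "rating"; rename it explicitly
  names(res)[names(res) == "rating"] <- "sum.weights"
  res
}

res <- w2_named(toy, function(x) (x + .001)^-1)
print(res)
```

Renaming by matching on the old name, rather than by column position, keeps the code working if the grouping variables change.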
You can accomplish the same thing with ddply without using summarize:
wrapper <- function ( ratingData, weight.func ) {
ddply(ratingData, c('studentId','problem','part'), function(x)c(sum.weights=sum(weight.func(x$rating))))
}
wrapper( sampleData, weight.func=function(x) (x+.001)^-1 )
In this case you can specify the name inside the function.
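The version above presumably works because the anonymous function is created inside wrapper, so ordinary lexical scoping lets it see weight.func. The same closure idea can be sketched in base R without plyr; `wrapper_base` and the `toy` data frame are hypothetical names used only for this illustration.

```r
# Hypothetical toy data with the same column names as sampleData
toy <- data.frame(
  studentId = c(1, 1, 2),
  problem   = c("p", "p", "p"),
  part      = c("a", "a", "a"),
  rating    = c(0, 1, 1)
)

wrapper_base <- function(ratingData, weight.func) {
  keys <- c("studentId", "problem", "part")
  # Split the rows by the combination of the grouping columns
  groups <- split(ratingData, ratingData[keys], drop = TRUE)
  # The anonymous function is defined here, so it can see weight.func
  pieces <- lapply(groups, function(g) {
    data.frame(g[1, keys], sum.weights = sum(weight.func(g$rating)))
  })
  do.call(rbind, pieces)
}

res <- wrapper_base(toy, weight.func = function(x) x + 1)
print(res)
```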
Answer 1 (score: 2)
This is a known bug in plyr: https://github.com/hadley/plyr/issues#issue/3
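One plausible reading of the error, sketched without plyr: a summarize-style function captures its arguments unevaluated and then evaluates them in the data plus the frame it was called from, which is a frame inside the applying function rather than inside wrapper. The functions below (`summarise_like`, `apply_like`, `wrapper_like`) are hypothetical stand-ins written for illustration and are not plyr's actual code.

```r
# Stand-in for a summarize-style function: evaluates the captured
# expressions in .data, falling back to the frame that called it
summarise_like <- function(.data, ...) {
  as.data.frame(eval(substitute(list(...)), .data, parent.frame()))
}

# Stand-in for the applying function: it calls .fun from its own frame
apply_like <- function(.data, .fun, ...) .fun(.data, ...)

wrapper_like <- function(d, weight.func) {
  apply_like(d, summarise_like, sum.weights = sum(weight.func(rating)))
}

toy <- data.frame(rating = c(0, 1))

# weight.func exists only inside wrapper_like, which is not on the
# lookup path used by summarise_like, so the call fails
result <- tryCatch(
  wrapper_like(toy, weight.func = function(x) x + 1),
  error = function(e) e
)
print(conditionMessage(result))
```

Defining weight.func in the global environment makes the lookup succeed, because the global environment is reachable from the frame of `apply_like`. That matches the behavior described in the question.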
Answer 2 (score: 0)
I am not sure which of the changes I made did it (taking out the space after "sum", changing NULL to an actual function, or <<>>), but it works now:
wrapper <- function ( ratingData, weight.func=weight.func) {
ddply(ratingData, .variables=c('studentId','problem','part'),
.fun=summarise, sum.weights = sum(weight.func(rating) ))
}
wrapper( sampleData, weight.func=weight.func )
studentId problem part sum.weights
1 8001 problem26 partA 3002.495758
2 8001 problem26 partB 8.983033
3 8001 problem27 partA 4000.499750
4 8001 problem27 partB 4000.999001
5 8002 problem26 partA 2004.491766
6 8002 problem26 partB 4000.499750
7 8002 problem27 partA 5000.000000
8 8002 problem27 partB 3000.999500
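Answer 3 (score: 0)
An update on this issue in plyr (https://github.com/hadley/plyr/issues/3):
Use the 'here' function in plyr: simply replace 'summarize' with 'here(summarize)' to give it access to the environment from which ddply is called.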
wrapper <- function(ratingData, weight.func){
ddply(ratingData, c('studentId','problem','part'),
here(summarize), # here(summarize)!
sum.weights = sum(weight.func(rating))
)
}
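To illustrate the idea behind this fix without depending on plyr, the sketch below builds a small wrapper that is created in the caller's environment, so a summarize-style function called through it can reach the caller's local variables. `here_like`, `summarise_like`, `apply_like`, and `wrapper_like` are hypothetical stand-ins, and plyr's real `here` may be implemented differently.

```r
# Stand-in for a summarize-style function (not plyr's code)
summarise_like <- function(.data, ...) {
  as.data.frame(eval(substitute(list(...)), .data, parent.frame()))
}

# Stand-in for the applying function
apply_like <- function(.data, .fun, ...) .fun(.data, ...)

# Returns a function whose enclosing environment is the caller's frame;
# f itself is inlined into the body so it does not need to be looked up
here_like <- function(f) {
  eval(substitute(function(...) (f)(...), list(f = f)), parent.frame())
}

wrapper_like <- function(d, weight.func) {
  # Build the wrapped function inside wrapper_like so that its
  # enclosing environment contains weight.func
  f <- here_like(summarise_like)
  apply_like(d, f, sum.weights = sum(weight.func(rating)))
}

toy <- data.frame(rating = c(0, 1))
res <- wrapper_like(toy, weight.func = function(x) x + 1)
print(res)
```

Here the lookup path for weight.func goes through the frame of the wrapped function and then into the frame of `wrapper_like`, which is where the argument lives.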