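I want to use dplyr to group a table by one column and then apply a function to the set of values in a second column within each group.
For example, in the code sample below, I want to return all 2-item combinations of the foods each person ate. I can't figure out how to
correctly pass the right column (foods) to the function inside do().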
library(dplyr)
person = c( 'Grace', 'Grace', 'Grace', 'Rob', 'Rob', 'Rob' )
foods = c( 'apple', 'banana', 'cucumber', 'spaghetti', 'cucumber', 'banana' )
eaten = data.frame(person, foods)
by_person = group_by(eaten, person)
# How to do this?
do( by_person, combn( x = foods, m = 2 ) )
Note that the example code in ?do fails on my machine:
mods <- do(carriers, failwith(NULL, lm), formula = ArrDelay ~ date)
Answer 0 (score: 14)
Let's define eaten like this:
eaten <- data.frame(person, foods, stringsAsFactors = FALSE)
1) Then try this:
eaten %.% group_by(person) %.% do(function(x) combn(x$foods, m = 2))
which gives:
[[1]]
     [,1]     [,2]       [,3]
[1,] "apple"  "apple"    "banana"
[2,] "banana" "cucumber" "cucumber"

[[2]]
     [,1]        [,2]        [,3]
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber"  "banana"    "banana"
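The %.% operator and the do(function(x) ...) form used in 1) belong to an early dplyr release. As a rough sketch of the same idea for a newer installation, assuming dplyr 0.8 or later (where group_map() is available and %>% is the pipe), something like the following should return one matrix of pairs per person. The variable name pairs is only illustrative.

```r
library(dplyr)

person <- c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob")
foods  <- c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana")
eaten  <- data.frame(person, foods, stringsAsFactors = FALSE)

# group_map() calls the formula once per group; .x is that group's rows
# (without the grouping column), so .x$foods is the person's food vector.
pairs <- eaten %>%
  group_by(person) %>%
  group_map(~ combn(.x$foods, m = 2))

# pairs is an unnamed list with one 2-row matrix per person,
# in group order (Grace, then Rob).
pairs
```

Each list element holds one combination per column, matching the layout of the output shown in 1).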
2) To do what @Hadley describes in the comments without waiting for a future version of dplyr, try the do2 function found
here:
library(gsubfn)
eaten %.% group_by(person) %.% fn$do2(~ combn(.$foods, m = 2))
which gives:
$Grace
     [,1]     [,2]       [,3]
[1,] "apple"  "apple"    "banana"
[2,] "banana" "cucumber" "cucumber"

$Rob
     [,1]        [,2]        [,3]
[1,] "spaghetti" "spaghetti" "cucumber"
[2,] "cucumber"  "banana"    "banana"
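If do2 is not at hand, a named list of the same shape can also be built with base R alone. This is only a sketch of an alternative, not the do2 implementation: it splits the foods column by person and applies combn() to each piece. The name pairs_by_person is illustrative.

```r
person <- c("Grace", "Grace", "Grace", "Rob", "Rob", "Rob")
foods  <- c("apple", "banana", "cucumber", "spaghetti", "cucumber", "banana")
eaten  <- data.frame(person, foods, stringsAsFactors = FALSE)

# split() returns a list of food vectors named by person;
# lapply() then runs combn(x, m = 2) on each vector.
pairs_by_person <- lapply(split(eaten$foods, eaten$person), combn, m = 2)

pairs_by_person
```

The result is indexed by name, so pairs_by_person$Grace and pairs_by_person$Rob retrieve each person's matrix directly.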
Note: the last line of code in the question, taken from the help file, also fails for me. This variation works for me: do(jan, lm, formula = ArrDelay ~ date).