Question

I'm using the "by" function in R to split a data frame and apply a function to each subset, like this:

pairwise.compare <- function(x) {
    Nright <- ...
    Nwrong <- ...
    Ntied <- ...
    return(c(Nright=Nright, Nwrong=Nwrong, Ntied=Ntied))
}
Z.by <- by(rankings, INDICES=list(rankings$Rater, rankings$Class), FUN=pairwise.compare)

The result (Z.by) looks like this:

: 4 
: 357 
Nright Nwrong Ntied
     3      0     0
------------------------------------------------------------
: 8 
: 357 
NULL
------------------------------------------------------------
: 10 
: 470 
Nright Nwrong Ntied
     3      4     1 
------------------------------------------------------------ 
: 11 
: 470 
Nright Nwrong Ntied
    12      4     1

What I want is to convert this result into a data frame (without the NULL entries) so that it looks like this:

  Rater Class Nright Nwrong Ntied
1     4   357      3      0     0
2    10   470      3      4     1
3    11   470     12      4     1

How can I do this?

Answer 1

The by function returns a list, so you can do something like this:

data.frame(do.call("rbind", by(x, column, mean)))
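As a minimal sketch of this approach on data shaped like the question's: the Score column and the summarise_group() helper below are made up for illustration and stand in for the real data and pairwise.compare. rbind() silently drops the NULL cells, so only groups that have data become rows. Note that the group labels (Rater, Class) are not carried over by this approach.

```r
# Toy data; the Score column is an assumption, not from the question.
rankings <- data.frame(
  Rater = c(4, 4, 10, 10, 11),
  Class = c(357, 357, 470, 470, 470),
  Score = c(1, 2, 3, 4, 5)
)

# Hypothetical stand-in for pairwise.compare: returns a named vector.
summarise_group <- function(d) c(N = nrow(d), Total = sum(d$Score))

res <- by(rankings, list(rankings$Rater, rankings$Class), summarise_group)

# Each non-NULL cell becomes one row; empty Rater/Class combinations vanish.
out <- data.frame(do.call("rbind", res))
print(out)
```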

Answer 2

Consider using ddply from the plyr package instead of by. It takes care of adding the grouping columns to the data frame.

Answer 3

Old post, but for anyone who searches for this topic:

analysis = by(...)
data.frame(t(vapply(analysis,unlist,unlist(analysis[[1]]))))

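unlist() takes an element of the by() output (in this case, analysis) and expresses it as a named vector. vapply() applies unlist to every element of analysis and collects the results. It needs a template argument to know the output type, which is what unlist(analysis[[1]]) is for. You may want to add a check that analysis is not empty, if that is a possibility. Each output is a column, so t() transposes it into the desired orientation, where each analysis entry becomes a row.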
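One caveat: when by() is given two grouping variables, combinations with no data come back as NULL, and vapply() stops with an error on those because unlist(NULL) does not match the template. A hedged sketch of one way to guard against that is to copy the cells into a plain list and drop the NULLs first. The toy data and the Score column are assumptions made for illustration.

```r
# Toy data; the Score column is an assumption, not from the question.
rankings <- data.frame(
  Rater = c(4, 4, 10, 10, 11),
  Class = c(357, 357, 470, 470, 470),
  Score = c(1, 2, 3, 4, 5)
)

analysis <- by(rankings, list(rankings$Rater, rankings$Class),
               function(d) list(N = nrow(d), Total = sum(d$Score)))

# Copy the cells into a plain list, then drop the empty (NULL) groups.
cells <- lapply(seq_along(analysis), function(i) analysis[[i]])
cells <- cells[!vapply(cells, is.null, logical(1))]

# The first remaining cell serves as the template for vapply().
template <- unlist(cells[[1]])
result <- data.frame(t(vapply(cells, unlist, template)))
print(result)
```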

Answer 4

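This expands on Shane's solution using rbind(), but also adds columns identifying the groups and removes the NULL groups, two features that were requested in the question. Because it uses only base package functions, no other dependencies (e.g. plyr) are required.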

simplify_by_output = function(by_output) {
    # by() returns NULL for combinations of grouping variables for which
    # there are no data. rbind() ignores those, so keep track of them.
    null_ind = unlist(lapply(by_output, is.null))
    # Combine the results row by row (one row per non-NULL group).
    by_df = do.call(rbind, by_output)
    # Add columns identifying groups, discarding groups for which no data exist.
    return(cbind(expand.grid(dimnames(by_output))[!null_ind, ], by_df))
}
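A usage sketch, assuming toy data with a made-up Score column and a simple summary function in place of pairwise.compare. Passing a named list as INDICES gives the dimnames their names, which is what makes the identifying columns come out as Rater and Class rather than Var1 and Var2.

```r
# Same function as above, repeated so this example runs on its own.
simplify_by_output <- function(by_output) {
  null_ind <- unlist(lapply(by_output, is.null))
  by_df <- do.call(rbind, by_output)
  cbind(expand.grid(dimnames(by_output))[!null_ind, ], by_df)
}

# Toy data; the Score column is an assumption, not from the question.
rankings <- data.frame(
  Rater = c(4, 4, 10, 10, 11),
  Class = c(357, 357, 470, 470, 470),
  Score = c(1, 2, 3, 4, 5)
)

# Named INDICES so the group columns are labelled Rater and Class.
res <- by(rankings,
          list(Rater = rankings$Rater, Class = rankings$Class),
          function(d) c(N = nrow(d), Total = sum(d$Score)))

out <- simplify_by_output(res)
print(out)
```

The group columns returned by expand.grid() are factors, so convert them with as.character() or as.numeric(as.character()) if numeric identifiers are needed.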

Answer 5

I would do:

x = by(data, list(data$x, data$y), function(d) whatever(d))
array(x, dim(x), dimnames(x))
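A small sketch of what this does, assuming toy data with a made-up Score column and a function that returns a single number per group. In that case by() yields a numeric array, and rebuilding it with array() strips the "by" class while keeping the dimensions and labels. Groups with no data show up as NA rather than being dropped.

```r
# Toy data; the Score column is an assumption, not from the question.
rankings <- data.frame(
  Rater = c(4, 4, 10, 10, 11),
  Class = c(357, 357, 470, 470, 470),
  Score = c(1, 2, 3, 4, 5)
)

# One scalar per group, so the result is a numeric Rater x Class array.
x <- by(rankings,
        list(Rater = rankings$Rater, Class = rankings$Class),
        function(d) sum(d$Score))

# Rebuild as a plain array: same dim and dimnames, no "by" class.
arr <- array(x, dim(x), dimnames(x))
print(arr)
```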

Converting a "by" object to a data frame in R
