R

Time: 2016-07-06 10:17:22

Tags: r combinations

I have a question about computing combinations within groups.

My mini sample looks like this:

sample <- data.frame(
  group=c("a","a","a","a","b","b","b"),
  number=c(1,2,3,2,4,5,3)
)

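If I apply the combn function to the data frame, I get the following result: all pairwise combinations of the values in the "number" column, regardless of which group each value belongs to: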

         [,1] [,2]
   [1,]    1    2
   [2,]    1    3
   [3,]    1    2
   [4,]    1    4
   [5,]    1    5
   [6,]    1    3
   [7,]    2    3
   [8,]    2    2
   [9,]    2    4
  [10,]    2    5
  [11,]    2    3
  [12,]    3    2
  [13,]    3    4
  [14,]    3    5
  [15,]    3    3
  [16,]    2    4
  [17,]    2    5
  [18,]    2    3
  [19,]    4    5
  [20,]    4    3
  [21,]    5    3

The code I used to produce the result above is:

t(combn((sample$number), 2))
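As a quick sanity check on why the ungrouped call yields 21 rows: combn enumerates every pair of positions in the vector, so a 7-element vector gives choose(7, 2) pairs, and duplicate values such as the two 2s are treated as distinct entries. A minimal sketch (the name `all_pairs` is only illustrative and not part of the original question):

```r
# Rebuild the sample data from the question
sample <- data.frame(
  group  = c("a", "a", "a", "a", "b", "b", "b"),
  number = c(1, 2, 3, 2, 4, 5, 3)
)

# All pairs across the whole column, ignoring the group
all_pairs <- t(combn(sample$number, 2))

# One row per pair of positions: choose(7, 2) = 21
nrow(all_pairs) == choose(nrow(sample), 2)
```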

However, I want the combinations computed within each group (i.e. "a" and "b"). The result I am after should look like this:

     [,1] [,2] [,3]
[1,]   a    1    2
[2,]   a    1    3
[3,]   a    1    2
[4,]   a    2    3
[5,]   a    2    2
[6,]   a    3    2
[7,]   b    4    5
[8,]   b    4    3
[9,]   b    5    3

In addition to the combinations, I want a column that indicates the group.
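The size of the desired result can be worked out in advance: a group with n values contributes choose(n, 2) pairs, so group "a" (4 values) gives 6 rows and group "b" (3 values) gives 3 rows. A small sketch of that arithmetic, assuming the same sample data (the name `expected_rows` is hypothetical):

```r
sample <- data.frame(
  group  = c("a", "a", "a", "a", "b", "b", "b"),
  number = c(1, 2, 3, 2, 4, 5, 3)
)

# Number of within-group pairs for each group
expected_rows <- vapply(
  split(sample$number, sample$group),
  function(x) choose(length(x), 2),
  numeric(1)
)

expected_rows       # a = 6, b = 3
sum(expected_rows)  # 9 rows in total
```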

2 answers:

Answer 0: (score: 4)

We can use the group-by functionality of data.table:

library(data.table)
setDT(sample)[, {i1 <-  combn(number, 2)
                   list(i1[1,], i1[2,]) }, by =  group]
#    group V1 V2
#1:     a  1  2
#2:     a  1  3
#3:     a  1  2
#4:     a  2  3
#5:     a  2  2
#6:     a  3  2
#7:     b  4  5
#8:     b  4  3
#9:     b  5  3
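To see why the grouped expression returns i1[1,] and i1[2,] as two list elements: combn(x, 2) returns a matrix with 2 rows and one column per pair, so each row becomes one output column. A base R sketch on the values of group "b" alone:

```r
# Values of group "b" from the question
i1 <- combn(c(4, 5, 3), 2)

dim(i1)   # 2 rows, 3 columns (one column per pair)
i1[1, ]   # first member of each pair:  4 4 5
i1[2, ]   # second member of each pair: 5 3 3
```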

Or a more compact option is

setDT(sample)[, transpose(combn(number, 2, FUN = list)), by = group]

Or using base R

 lst <- by(sample$number, sample$group, FUN = combn, m= 2)
 data.frame(group = rep(unique(as.character(sample$group)), 
                        sapply(lst, ncol)), t(do.call(cbind, lst)))

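Answer 1: (score: 4)

Here is a base R option that uses (1) split to create a list of data.frames, one per unique group value, (2) lapply to loop over each list element and compute the combinations with combn, and (3) do.call(rbind, ...) to collect the list elements back into a single data.frame: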

do.call(rbind, lapply(split(sample, sample$group), function(x) {
   # one row per pair, with the group label repeated alongside
   data.frame(group = x$group[1], t(combn(x$number, 2)))
}))

#    group X1 X2
#a.1     a  1  2
#a.2     a  1  3
#a.3     a  1  2
#a.4     a  2  3
#a.5     a  2  2
#a.6     a  3  2
#b.1     b  4  5
#b.2     b  4  3
#b.3     b  5  3
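One caveat with calling combn directly on the values: when a group contains a single positive number n, combn(n, 2) treats it as seq_len(n) rather than as one value, and a group of size one holding the value 1 raises an error. The following is a hedged variation on the split/lapply/do.call(rbind, ...) steps above that combines positions instead of values and skips groups with fewer than two members. The helper name `pairs_by_group` and its arguments are hypothetical, not part of the original answer:

```r
# Hypothetical helper: within-group pairs, robust to groups of size < 2
pairs_by_group <- function(df, group_col = "group", value_col = "number") {
  values <- split(df[[value_col]], df[[group_col]])
  pieces <- Map(function(g, v) {
    if (length(v) < 2) return(NULL)      # no pair can be formed
    idx <- combn(length(v), 2)           # combine positions, not values
    data.frame(group = g, V1 = v[idx[1, ]], V2 = v[idx[2, ]],
               stringsAsFactors = FALSE)
  }, names(values), values)
  out <- do.call(rbind, Filter(Negate(is.null), pieces))
  rownames(out) <- NULL                  # drop the a.1, a.2, ... row names
  out
}

sample <- data.frame(
  group  = c("a", "a", "a", "a", "b", "b", "b"),
  number = c(1, 2, 3, 2, 4, 5, 3)
)
res <- pairs_by_group(sample)

# A group with a single value ("c") simply contributes no rows
res2 <- pairs_by_group(rbind(sample, data.frame(group = "c", number = 5)))
```

With the sample data this gives the same nine rows as the answer above, with columns named V1 and V2 and plain integer row names.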

There is also a dplyr option:

library(dplyr)
sample %>% group_by(group) %>% do(data.frame(t(combn(.$number, 2))))
#Source: local data frame [9 x 3]
#Groups: group [2]
#
#   group    X1    X2
#  (fctr) (dbl) (dbl)
#1      a     1     2
#2      a     1     3
#3      a     1     2
#4      a     2     3
#5      a     2     2
#6      a     3     2
#7      b     4     5
#8      b     4     3
#9      b     5     3
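Since do() has been superseded in newer dplyr releases, the same idea can probably be written with reframe(), which allows each group to return several rows. This is a sketch that assumes dplyr 1.1.0 or later (where reframe() and its .by argument are available); it is not part of the original answer, and the result name `res` is illustrative:

```r
library(dplyr)

sample <- data.frame(
  group  = c("a", "a", "a", "a", "b", "b", "b"),
  number = c(1, 2, 3, 2, 4, 5, 3)
)

# Each group returns a data frame of pairs; reframe() stacks them
# and unpacks the unnamed data frame into columns V1 and V2
res <- sample %>%
  reframe(as.data.frame(t(combn(number, 2))), .by = group)
```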