Creating combinations within split groups in R

Time: 2016-04-12 16:22:49

Tags: r split combn

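Given the data frame below with location, day, and quantity, I am looking for a way to build every combination of quantities across days within each location, taking one quantity per day. In production these combinations can get very large, so a data.table or plyr approach would be appreciated.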

library(gtools)    
dat <- data.frame(Loc = c(51,51,51,51,51), Day = c("Mon","Mon","Tue","Tue","Wed"),
Qty = c(1,2,3,4,5))

The output for this example should be:

  Loc Day  Qty
1  51 Mon   1
2  51 Tue   3
3  51 Wed   5

4  51 Mon   1
5  51 Tue   4
6  51 Wed   5

7  51 Mon   2
8  51 Tue   3
9  51 Wed   5

10 51 Mon   2
11 51 Tue   4
12 51 Wed   5

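I tried a couple of nested lapply calls that get me close, but I don't know how to take it to the next step and use the combn() function within each store.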

lapply(split(dat, dat$Loc), function(x) {
      lapply(split(x, x$Day), function(y) {
          y$Qty
    })                
})

I can get the right combinations if each Store > Day group is in its own list, but I am struggling with how to get there from a data frame using a split-apply-combine approach.

loc51_mon <- c(1,2)
loc51_tue <- c(3,4)
loc51_wed <- c(5)

unlist(lapply(loc51_mon, function(x) {
    lapply(loc51_tue, function(y) {
         lapply(loc51_wed, function(z) {
              combn(c(x,y,z), 3)
         })
    })
}), recursive = FALSE)

[[1]]
[[1]][[1]]
     [,1]
[1,]    1
[2,]    3
[3,]    5

[[2]]
[[2]][[1]]
     [,1]
[1,]    1
[2,]    4
[3,]    5

[[3]]
[[3]][[1]]
     [,1]
[1,]    2
[2,]    3
[3,]    5

[[4]]
[[4]][[1]]
     [,1]
[1,]    2
[2,]    4
[3,]    5

2 answers:

Answer 0 (score: 2)

This should work, but anything more complex will require changes to the function:

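library(data.table)
dat <- data.frame(Loc = c(51,51,51,51,51), Day = c("Mon","Mon","Tue","Tue","Wed"),
                  Qty = c(1,2,3,4,5), stringsAsFactors = FALSE)
setDT(dat)

comb_in <- function(Qty_In, Day_In){
    # Collapse the quantities of each day into one "|"-separated string
    temp_df <- aggregate(Qty_In ~ Day_In, cbind(Qty_In, as.character(Day_In)), paste, collapse = "|")
    # Split the strings back into one vector of quantities per day
    # (Qty passes through paste/strsplit here, so it is no longer numeric)
    temp_list <- strsplit(temp_df$Qty_In, split = "|", fixed = TRUE)
    names(temp_list) <- as.character(temp_df$Day_In)
    # One row per combination, numbered by case_group, then reshaped to long format
    melt(as.data.table(expand.grid(temp_list))[, case_group := .I], id.vars = "case_group", variable.name = "Day", value.name = "Qty")
}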

dat[, comb_in(Qty_In = Qty, Day_In = Day), by = Loc][order(Loc,case_group,Day)]
    Loc case_group Day Qty
 1:  51          1 Mon   1
 2:  51          1 Tue   3
 3:  51          1 Wed   5
 4:  51          2 Mon   2
 5:  51          2 Tue   3
 6:  51          2 Wed   5
 7:  51          3 Mon   1
 8:  51          3 Tue   4
 9:  51          3 Wed   5
10:  51          4 Mon   2
11:  51          4 Tue   4
12:  51          4 Wed   5

You can now filter by case_group to get each combination.
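As a minimal sketch of that filtering step: the table `res` below is a hand-built copy of the output shown above (an assumption made so the snippet runs on its own), and the names `combo_3` and `combos` are purely illustrative.

```r
library(data.table)

# Hypothetical stand-in for the result printed above, rebuilt by hand
res <- data.table(
  Loc        = 51,
  case_group = rep(1:4, each = 3),
  Day        = rep(c("Mon", "Tue", "Wed"), times = 4),
  Qty        = c(1, 3, 5,  2, 3, 5,  1, 4, 5,  2, 4, 5)
)

# Keep a single combination
combo_3 <- res[case_group == 3]

# Or split into one table per location and combination
combos <- split(res, by = c("Loc", "case_group"))
```

Splitting on both `Loc` and `case_group` matters once there is more than one location, because `case_group` restarts at 1 within each `Loc`.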

Answer 1 (score: 0)

This question is very similar to "How to expand.grid on vectors sets rather than single elements".

Here is a general approach (it may be slower than a method written for this specific problem):

permu.sets <- function(listoflist) {
    #assumes that each list within listoflist contains vectors of equal lengths
    temp <- expand.grid(listoflist)   
    do.call(cbind, lapply(temp, function(x) do.call(rbind, x)))
} #permu.sets

#for the problem posted in OP
dat <- data.frame(Loc = c(51,51,51,51,51), Day = c("Mon","Mon","Tue","Tue","Wed"),
    Qty = c(1,2,3,4,5))
vecsets <- lapply(split(dat, dat$Day), function(x) split(as.matrix(x), row(x)))
res <- permu.sets(vecsets)
lapply(split(res, seq(nrow(res))), function(x) matrix(x, ncol=3, byrow=TRUE))
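For comparison, a problem-specific version might look like the base R sketch below. It assumes that only Qty varies within a Loc and Day group. `combos_for_loc` is a hypothetical helper name, not a function from any package. It is meant as an illustration of the split-apply-combine idea rather than a tuned solution for large data.

```r
dat <- data.frame(Loc = c(51, 51, 51, 51, 51),
                  Day = c("Mon", "Mon", "Tue", "Tue", "Wed"),
                  Qty = c(1, 2, 3, 4, 5),
                  stringsAsFactors = FALSE)

# Hypothetical helper: all Qty combinations (one per day) for one location
combos_for_loc <- function(d) {
  qty_by_day <- split(d$Qty, d$Day)   # one vector of quantities per day
  grid <- expand.grid(qty_by_day)     # one row per combination
  # Reshape to long format: the columns of grid are stacked one after another
  data.frame(Loc        = d$Loc[1],
             case_group = rep(seq_len(nrow(grid)), times = ncol(grid)),
             Day        = rep(names(grid), each = nrow(grid)),
             Qty        = unlist(grid, use.names = FALSE),
             stringsAsFactors = FALSE)
}

# Split by location, apply the helper, combine the pieces
res <- do.call(rbind, lapply(split(dat, dat$Loc), combos_for_loc))
res <- res[order(res$Loc, res$case_group, res$Day), ]
rownames(res) <- NULL
```

Note that `split()` orders the days alphabetically, which happens to match Mon, Tue, Wed here. With a full week of data, Day would need to be a factor with explicit levels to keep calendar order.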