How to select elements whose sum is n?

Time: 2015-10-25 12:31:13

Tags: r

Say I have the following data frame:

x <- c("p1","p2","p3","p4","p5","p6","p7","p8","p9","p10")
y <- c(1,4,3,5,5,7,2,2,6,8)
df <- data.frame(x,y)

x represents players and y represents goals. I want all subsets of players whose goals sum to 10, for example:

{p1,p3,p9},{p3,p6},{p7,p8,p9}...

3 answers:

Answer 0: (score: 5)

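1) lpSolve: This can be done with integer linear programming. We use an all-zero objective, c(0, ..., 0), and a one-row constraint matrix consisting of y. The right-hand side of the constraint must equal the target sum, 10. The number of players, length(y), also happens to be 10 here, so the code keeps the two values in separate variables.

library(lpSolve)
y <- c(1,4,3,5,5,7,2,2,6,8)
n <- length(y)  # number of players
target <- 10    # required sum of goals
k <- sum(cumsum(sort(y)) <= target) + 1 # upper bound on no. of players in a group
out <- lp(objective = numeric(n), 
   const.mat = matrix(y, 1), const.dir = "==", const.rhs = target,
   all.bin = TRUE, num.bin.solns = sum(choose(n, 1:k)))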

# solution vector seems to have junk at end so truncate it and reshape to matrix
soln <- matrix(head(out$solution, n * out$num.bin.solns), n)

A total of 19 solutions are found:

> out
Success: the objective function is 0 
         19 solutions returned

> out$num.bin.solns
[1] 19

> dim(soln)
[1] 10 19

The columns of soln are the feasible solutions. For example, the first solution is players 1, 2 and 4:

> soln[, 1]
 [1] 1 1 0 1 0 0 0 0 0 0
> which(soln[, 1]==1)
[1] 1 2 4

We can list the solutions as strings like this:

> x <- c("p1","p2","p3","p4","p5","p6","p7","p8","p9","p10")
> apply(soln == 1, 2, function(v) toString(x[v]))
 [1] "p1, p2, p4"     "p4, p5"         "p3, p4, p7"     "p1, p4, p7, p8"
 [5] "p1, p2, p3, p8" "p1, p2, p3, p7" "p1, p3, p9"     "p3, p4, p8"    
 [9] "p1, p2, p5"     "p3, p5, p7"     "p2, p9"         "p3, p5, p8"    
[13] "p3, p6"         "p1, p5, p7, p8" "p1, p6, p7"     "p1, p6, p8"    
[17] "p7, p8, p9"     "p8, p10"        "p7, p10"  
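
The bound k used in approach (1) can be checked by hand. The following is a minimal base-R sketch, assuming all goal counts are positive; the names target, smallest_sums, max_size and cap are introduced here for illustration only and do not appear in the answer's code.

```r
y <- c(1, 4, 3, 5, 5, 7, 2, 2, 6, 8)
target <- 10  # required sum (illustrative name)

# Running totals of the goals sorted ascending: the smallest possible
# sum for a group of 1, 2, 3, ... players.
smallest_sums <- cumsum(sort(y))

# Largest group size whose smallest possible sum still fits the target.
# This relies on every element of y being positive.
max_size <- sum(smallest_sums <= target)

# The answer adds 1, so k is a safe (slightly loose) upper bound.
k <- max_size + 1

# Number of subsets with between 1 and k players, which is the cap
# passed to lp() as num.bin.solns.
cap <- sum(choose(length(y), 1:k))
```

Because the four smallest values already add up to 8 and the five smallest to 12, no group of five or more players can reach exactly 10, which is why capping the group size is safe here.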

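2) wle: A second approach is to create all 2^10 subsets of 1:10 as binary vectors v and then select those for which y %*% v == 10 (where y is from the question). This gives concise code and works as long as y is not too long, since the number of subsets doubles with each additional element.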

library(wle)
m <- sapply(0:(2^10-1), function(x) binary(x, 10)$binary)
soln2 <- m[, y %*% m == 10]

If the string form is preferred, convert it to a character vector using the same method as in (1).
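
If installing wle is not an option, the same enumeration can be sketched in base R. This is an assumed equivalent, not the answer's own code: it swaps wle's binary() for expand.grid(), stores the 0/1 selections as rows rather than columns, and the names grid, totals, hits and groups are illustrative.

```r
x <- paste0("p", 1:10)
y <- c(1, 4, 3, 5, 5, 7, 2, 2, 6, 8)

# Every row of `grid` is one 0/1 selection vector over the 10 players,
# so there are 2^10 = 1024 rows (including the empty selection).
grid <- as.matrix(expand.grid(rep(list(0:1), length(y))))

# Goal total for each selection; keep the selections that add up to 10.
totals <- drop(grid %*% y)
hits <- grid[totals == 10, , drop = FALSE]

# Convert each selected row to a comma-separated string of player names.
groups <- apply(hits == 1, 1, function(v) toString(x[v]))
```

As with approach (2), this only stays practical for short y, because the grid has 2^length(y) rows.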

Update: Made some corrections and improvements, and added (2).

Answer 1: (score: 2)

You can use this brute-force approach. The result will be:

 [1] "p2,p9"       "p3,p6"       "p4,p5"       "p7,p10"      "p8,p10"      "p1,p2,p4"    "p1,p2,p5"   
 [8] "p1,p3,p9"    "p1,p6,p7"    "p1,p6,p8"    "p3,p4,p7"    "p3,p4,p8"    "p3,p5,p7"    "p3,p5,p8"   
[15] "p7,p8,p9"    "p1,p2,p3,p7" "p1,p2,p3,p8" "p1,p4,p7,p8" "p1,p5,p7,p8"

Data:

x<-c("p1","p2","p3","p4","p5","p6","p7","p8","p9","p10")
y<-c(1,4,3,5,5,7,2,2,6,8)
df<-data.frame(x=x,y=y, stringsAsFactors = FALSE)
df$id <- seq_len(nrow(df)) # Adding an ID column

Get all possible combinations of up to max_comb elements

max_comb <- nrow(df)
my_combn <- function(m, x){
  combn(x, m, simplify = FALSE)
}
dat <- lapply(1:max_comb, my_combn, df$id)

Set the names of the selected combinations

# Join the selected names into one string, e.g. "p2,p9"
combn_names <- function(ind, vec, collapse = ","){
  paste(vec[ind], collapse = collapse)
}

set_list_combn_names <- function(l, vec){
  setNames(l, lapply(l, combn_names, vec = vec))
}
dat <- lapply(dat, set_list_combn_names, df$x)

Check whether the sum equals x = 10 and output the names of the combinations

sum_equal_x <- function(ind, vec, x){
  sum(vec[ind]) == x
}
names(which(unlist(lapply(dat, lapply, sum_equal_x, df$y, 10))))

Result:

> names(which(unlist(lapply(dat, lapply, sum_equal_x, df$y, 10))))
 [1] "p2,p9"       "p3,p6"       "p4,p5"       "p7,p10"      "p8,p10"      "p1,p2,p4"    "p1,p2,p5"   
 [8] "p1,p3,p9"    "p1,p6,p7"    "p1,p6,p8"    "p3,p4,p7"    "p3,p4,p8"    "p3,p5,p7"    "p3,p5,p8"   
[15] "p7,p8,p9"    "p1,p2,p3,p7" "p1,p2,p3,p8" "p1,p4,p7,p8" "p1,p5,p7,p8"

Answer 2: (score: 1)

You can do it like this:

require(utils)

x<-c("p1","p2","p3","p4","p5","p6","p7","p8","p9","p10")
y<-c(1,4,3,5,5,7,2,2,6,8)
df<-data.frame(x=x, y=y, stringsAsFactors = FALSE) # keep y numeric (cbind() would coerce it to character)

search.val <- 10
max.num    <- length(x)

all.comb <- lapply(1:max.num, function(n){ combn(x,n) })

# Calculate sums
#   For each combination length 1:n
sums <- lapply(all.comb, function(comb.mat){
  # For each combination of length n
  apply(comb.mat,2,function(col){
    sum(as.numeric( df[which(df$x %in% col),]$y ))
  })
})

# Find which combinations have sum 10
vals <- lapply(1:max.num,function(i){
  sum.vect <- sums[[i]]
  inds     <- which(sum.vect == search.val)
  lapply(inds, function(j){
    all.comb[[i]][,j]
  })
})

sum.of.10 <- unlist(vals,recursive=FALSE)

This is a brute-force solution in which the sums of all combinations up to length max.num are computed using the combn function.
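
The same brute-force idea can be written more compactly by combining indices instead of names, which avoids looking each name up in the data frame again. This is only a sketch under that assumption, not a replacement for the code above; the names matches, idx and keep are illustrative.

```r
x <- paste0("p", 1:10)
y <- c(1, 4, 3, 5, 5, 7, 2, 2, 6, 8)
search.val <- 10

# For each group size m, enumerate the index combinations as columns
# and keep the columns whose goals add up to search.val.
matches <- lapply(seq_along(y), function(m) {
  idx <- combn(seq_along(y), m)  # m x choose(10, m) matrix of indices
  keep <- colSums(matrix(y[idx], nrow = m)) == search.val
  lapply(which(keep), function(j) x[idx[, j]])
})

# Flatten one level so each element is a character vector of players.
sum.of.10 <- unlist(matches, recursive = FALSE)
```

Each element of sum.of.10 is then one group of players, for example c("p3", "p6"). Like the version above, this visits every one of the 2^10 - 1 non-empty subsets, so it only suits short inputs.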