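Create a data frame by adding a new letter to combinations in R

Time: 2019-09-15 16:54:33

Tags: r dataframe combinations

I need to create a data frame of character combinations. I have 3 characters, e.g. x1, x2 and x3. The code for my previous data frame is: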

gain<-do.call(rbind, lapply(2:3, function(x) {
   do.call(rbind, combn(3, x, function(y) {
    data.frame(A = paste(y, collapse = ""),
               B = paste(c("", y), collapse = "x"),
               stringsAsFactors = FALSE)
}, simplify = FALSE))
}))


 > gain
   A      B
1  12   x1x2
2  13   x1x3
3  23   x2x3
4 123 x1x2x3

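The problem now is that I need to add a new letter "b" to the second column, indexed the same way as "x", with the terms of each combination separated by a "+" sign. The output I want is: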

> gain
    A              B
1  12      b1x1+b2x2
2  13      b1x1+b3x3
3  23      b2x2+b3x3
4 123 b1x1+b2x2+b3x3

I can't get this to work. I would appreciate any help.

2 answers:

Answer 0: (score: 2)

# Prefix each "x<digit>" term with "b<digit>" and append a "+" after it
tmp = gsub("(x(\\d))", "b\\2\\1\\+", gain$B)
# Drop the trailing "+"
substring(tmp, 1, nchar(tmp) - 1)
#[1] "b1x1+b2x2"      "b1x1+b3x3"      "b2x2+b3x3"      "b1x1+b2x2+b3x3"

Or, if you want to start from A:

tmp = gsub("(\\d)", "b\\1x\\1\\+", gain$A)
substring(tmp, 1, nchar(tmp) - 1)
#[1] "b1x1+b2x2"      "b1x1+b3x3"      "b2x2+b3x3"      "b1x1+b2x2+b3x3"
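As a regex-free variation on the same idea, the digits of A can be split apart and pasted back together, which avoids trimming a trailing "+". This is only a sketch: `make_terms` is a hypothetical helper name, and it assumes every index is a single digit (as in the example data).

```r
# Hypothetical helper (not from the original answer): build "b1x1+b2x2"
# from a value such as 12. Assumes single-digit indices.
make_terms <- function(a) {
  digits <- strsplit(as.character(a), "")[[1]]    # "12" -> "1" "2"
  paste0("b", digits, "x", digits, collapse = "+")
}

A <- c(12L, 13L, 23L, 123L)
result <- vapply(A, make_terms, character(1))
result
#[1] "b1x1+b2x2"      "b1x1+b3x3"      "b2x2+b3x3"      "b1x1+b2x2+b3x3"
```

Because `collapse` only places the separator between terms, no post-processing step is needed.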

Answer 1: (score: 2)

An option with strsplit and paste:

gain$B <- sapply(strsplit(gain$B, "(?<=\\d)(?=x)", perl = TRUE), 
   function(x) paste(paste0("b",
     unlist(regmatches(x, gregexpr("\\d+", x)))), x, collapse="+", sep=""))


gain$B
#[1] "b1x1+b2x2"      "b1x1+b3x3"      "b2x2+b3x3"      "b1x1+b2x2+b3x3"
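A related sketch, swapping the zero-width split for term extraction: `regmatches()` with `gregexpr()` pulls out each `x<digits>` term directly, so no lookaround is needed and multi-digit indices would also be handled. The variable names here are illustrative, and the input is the original (unmodified) B column.

```r
# Original B values, before any transformation
B <- c("x1x2", "x1x3", "x2x3", "x1x2x3")

# Extract every "x<digits>" term from each string: list("x1" "x2", ...)
terms <- regmatches(B, gregexpr("x\\d+", B))

# For each string, prefix every term with "b<index>" and join with "+"
result <- vapply(terms, function(tm) {
  idx <- sub("x", "", tm, fixed = TRUE)   # "x1" -> "1"
  paste0("b", idx, tm, collapse = "+")
}, character(1))

result
#[1] "b1x1+b2x2"      "b1x1+b3x3"      "b2x2+b3x3"      "b1x1+b2x2+b3x3"
```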

Or an option with gsub:

# Applied to the original gain$B ("x1x2", ...), not to the reassigned column above
gsub("(x)(\\d+)", "b\\2\\1\\2", gsub("(?<=\\d)(?=x)", "+", gain$B, perl = TRUE))
#[1] "b1x1+b2x2"      "b1x1+b3x3"      "b2x2+b3x3"      "b1x1+b2x2+b3x3"

Data

gain <- structure(list(A = c(12L, 13L, 23L, 123L), B = c("x1x2", "x1x3", 
"x2x3", "x1x2x3")), class = "data.frame", row.names = c("1", 
"2", "3", "4"))
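If the data frame is still being generated with the question's `combn()` code, the target column could also be built at that point instead of being patched afterwards. The following is a minimal sketch of that variation, not something taken from either answer; only the `B =` line differs from the question's code.

```r
gain <- do.call(rbind, lapply(2:3, function(k) {
  do.call(rbind, combn(3, k, function(y) {
    data.frame(A = paste(y, collapse = ""),
               # paste0() is vectorised over y; collapse joins the terms with "+"
               B = paste0("b", y, "x", y, collapse = "+"),
               stringsAsFactors = FALSE)
  }, simplify = FALSE))
}))

gain$B
#[1] "b1x1+b2x2"      "b1x1+b3x3"      "b2x2+b3x3"      "b1x1+b2x2+b3x3"
```

Here A comes out as a character column ("12", "13", ...), as in the question's original code, rather than the integer column used in the Data block.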