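I am trying to figure out how to paste multiple columns together with a separator, but I want to combine the columns in groups of, say, two. For example, I have a data frame df as follows:
df <- data.frame(matrix(ncol = 4, nrow = 3))
x <- c("a", "b", "c", "d")
colnames(df) <- x
df$a <- c("man", "bear", "pig")
df$b <- c("chicken", "moose", "bear")
df$c <- c("fish", "dog", "bear")
df$d <- c("dog", "mouse", "moose")
df
#     a       b    c     d
#1  man chicken fish   dog
#2 bear   moose  dog mouse
#3  pig    bear bear moose
I would like to combine columns a + b and c + d, respectively. I can achieve this by pasting them together as follows:
df$combined1 <- paste(df$a, df$b, sep = " + ")
df$combined2 <- paste(df$c, df$d, sep = " + ")
But I really want to stick to the DRY principle to get cleaner code. I tried to do this another way, but had no luck. Any ideas?
Thanks for your help!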
Answer 0 (score: 2)
First, in the spirit of readability, let's simplify your data creation code. There is no need for all those intermediate variables:
df <- data.frame(
a = c("man", "bear", "pig"),
b = c("chicken", "moose", "bear"),
c = c("fish", "dog", "bear"),
d = c("dog", "mouse", "moose")
)
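Now to your question. This solution is quite general. First, we define a list of the column groups to combine. Then we loop over that list, build a name for each combined column, and paste the columns together, referring only to the data and the list of groups: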
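# Each element lists the positions of the columns to paste together
cols_to_combine <- list(c(1, 2), c(3, 4))
for (comb in cols_to_combine) {
  # Name the new column after the positions it combines, e.g. "combined_1_2"
  new_name <- paste0("combined_", paste(comb, collapse = "_"))
  # paste() is vectorised, so do.call() pastes the selected columns row by row
  df[[new_name]] <- do.call(paste, args = c(df[comb], sep = " + "))
}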
df
#      a       b    c     d  combined_1_2 combined_3_4
# 1  man chicken fish   dog man + chicken   fish + dog
# 2 bear   moose  dog mouse  bear + moose  dog + mouse
# 3  pig    bear bear moose    pig + bear bear + moose
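If you need this in more than one place, the loop above can be wrapped in a small function. The following is only a sketch: the helper name combine_columns and its arguments are hypothetical, not something from base R or from the question. It assumes you would rather refer to columns by name and choose the new column names yourself, by passing a named list.

```r
# Hypothetical helper (illustration only): for each element of `groups`,
# paste the listed columns together and store the result in a new column
# named after that list element.
combine_columns <- function(data, groups, sep = " + ") {
  for (new_name in names(groups)) {
    cols <- groups[[new_name]]
    # unname() keeps column names from being matched to paste()'s own
    # arguments (for example a column that happens to be called "sep")
    data[[new_name]] <- do.call(paste, c(unname(as.list(data[cols])), sep = sep))
  }
  data
}

df <- data.frame(
  a = c("man", "bear", "pig"),
  b = c("chicken", "moose", "bear"),
  c = c("fish", "dog", "bear"),
  d = c("dog", "mouse", "moose"),
  stringsAsFactors = FALSE
)

result <- combine_columns(
  df,
  list(combined1 = c("a", "b"), combined2 = c("c", "d"))
)
result$combined1
# [1] "man + chicken" "bear + moose"  "pig + bear"
```

Referring to columns by name rather than by position means the call keeps working if the column order changes, and the named list gives you the combined1 / combined2 names from the question instead of generated ones.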