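I have an R data frame in which 3 columns contain the values 0 or 1. I need to create a new column that concatenates the names of the columns whose value is 1, separated by '&'. The code below works with the empty string '' as the separator, but it produces stray separators when I change it to '&'.
Code: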
A = c(1,0,1,0,0,1)
B = c(1,1,1,0,1,0)
C = c(0,0,0,1,1,1)
data = data.frame(A, B, C)
data$New = paste(ifelse(data$A == 1, "A", ""),
ifelse(data$B == 1, "B", ""),
ifelse(data$C == 1, "C", ""), sep = '')
data
Output:
A B C New
1 1 1 0 AB
2 0 1 0 B
3 1 1 0 AB
4 0 0 1 C
5 0 1 1 BC
6 1 0 1 AC
Code and output with the '&' separator:
A = c(1,0,1,0,0,1)
B = c(1,1,1,0,1,0)
C = c(0,0,0,1,1,1)
data = data.frame(A, B, C)
data$New = paste(ifelse(data$A == 1, "A", ""),
ifelse(data$B == 1, "B", ""),
ifelse(data$C == 1, "C", ""), sep = '&')
data
A B C New
1 1 1 0 A&B&
2 0 1 0 &B&
3 1 1 0 A&B&
4 0 0 1 &&C
5 0 1 1 &B&C
6 1 0 1 A&&C
Expected output:
A B C New
1 1 1 0 A&B
2 0 1 0 B
3 1 1 0 A&B
4 0 0 1 C
5 0 1 1 B&C
6 1 0 1 A&C
Is there a way to do this without ifelse conditions?
Answer 0 (score: 3)
We can loop over the rows and subset the names of the non-zero entries:
data$New <- apply(data[1:3], 1, function(x) paste(names(x[x!=0]), collapse="&"))
data$New
#[1] "A&B" "B" "A&B" "C" "B&C" "A&C"
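To see why collapse works where sep does not, it can help to look at a single row in isolation. The sketch below uses a hypothetical named vector x standing in for row 6 of the example data; with_sep and with_collapse are illustrative names only.

```r
# One row of the example data: A = 1, B = 0, C = 1
x <- c(A = 1, B = 0, C = 1)

# sep joins every argument, including the empty string produced for B
with_sep <- paste("A", "", "C", sep = "&")

# collapse joins only the names that survive the subset x != 0
with_collapse <- paste(names(x[x != 0]), collapse = "&")

print(with_sep)       # "A&&C"
print(with_collapse)  # "A&C"
```

The empty strings are still arguments to paste, so sep is inserted around them; dropping the zero entries before pasting avoids the problem.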
It can also be done column-wise:
library(tidyverse)
data[1:3] %>%
na_if(0) %>%
`*`(col(.)) %>%
imap(~ rep(.y, length(.x))[.x]) %>%
reduce(paste, sep= "&") %>%
str_remove("(NA&)+|(&NA)+") %>%
str_remove("&NA")
#[1] "A&B" "B" "A&B" "C" "B&C" "A&C"
Answer 1 (score: 3)
You can use apply together with paste.
nms <- names(data)
data$New <- apply(data, 1, function(x){
paste(nms[as.logical(x)], collapse = "&")
})
data
# A B C New
#1 1 1 0 A&B
#2 0 1 0 B
#3 1 1 0 A&B
#4 0 0 1 C
#5 0 1 1 B&C
#6 1 0 1 A&C
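One thing to watch for: apply(data, 1, ...) runs over every column, so if data already contains a New column (for instance after trying the previous answer), that character column is pulled into each row vector. A minimal sketch that restricts the work to the indicator columns, assuming they are listed in a hypothetical vector cols:

```r
data <- data.frame(A = c(1, 0, 1, 0, 0, 1),
                   B = c(1, 1, 1, 0, 1, 0),
                   C = c(0, 0, 0, 1, 1, 1))

# Hypothetical: names of the 0/1 indicator columns
cols <- c("A", "B", "C")

# data[cols] == 1 is a logical matrix; each row selects the matching names
data$New <- apply(data[cols] == 1, 1, function(x) paste(cols[x], collapse = "&"))

# Running the same line again is safe because New is never part of the input
data$New <- apply(data[cols] == 1, 1, function(x) paste(cols[x], collapse = "&"))
data$New
```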
Answer 2 (score: 2)
Use which with arr.ind = TRUE, then aggregate:
cbind(data,
new = aggregate(col ~ row, data = which(data == 1, arr.ind = TRUE),
function(x) paste(names(data)[x], collapse = "&"))[ , "col"])
# A B C new
# 1 1 1 0 A&B
# 2 0 1 0 B
# 3 1 1 0 A&B
# 4 0 0 1 C
# 5 0 1 1 B&C
# 6 1 0 1 A&C
Similarly, using tapply:
ix <- which(data == 1, arr.ind = TRUE)
cbind(data,
new = tapply(ix[ , "col"], ix[ , "row"],
function(x) paste(names(data)[x], collapse = "&")))
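A possible caveat, not covered by the example data: a row with no 1s never appears in the which() result, so both versions above would return fewer values than data has rows. Below is a sketch that keeps such rows as empty strings, assuming that is the desired result. The small data frame d and the names rows and groups are made up for illustration.

```r
# Hypothetical data with an all-zero second row
d <- data.frame(A = c(1, 0, 0), B = c(1, 0, 1), C = c(0, 0, 1))

ix <- which(d == 1, arr.ind = TRUE)

# A factor with one level per row keeps rows that have no matches
rows <- factor(ix[, "row"], levels = seq_len(nrow(d)))

# split() returns an empty group for every unused level
groups <- split(ix[, "col"], rows)

# Pasting zero names with collapse yields "" for the empty groups
d$new <- vapply(groups,
                function(x) paste(names(d)[x], collapse = "&"),
                character(1), USE.NAMES = FALSE)
d$new
```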