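I have a dataset with multiple rows per ID and a column, V1, whose indicator value varies by row.
Within each ID, I would like every row of V1 to be replaced with A, provided that at least one entry of V1 equals A for that ID; otherwise the values should be left as they are. The output I am looking for is as follows:
ID V1
1 A
1 A
1 A
2 A
2 A
2 A
3 B
3 C
3 C
Thanks!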
Answer 0 (score: 0)
My shortest solution so far is to create an intermediate column and then drop it (I'll see whether I can fit it into a one-liner):
library(dplyr)
# group by ID; if "A" is present in any row of the group,
# assign "A" to V2, otherwise assign NA
df <- df %>% group_by(ID) %>% mutate(V2 = ifelse(any(V1 == "A"), "A", NA_character_)) %>% ungroup()
# overwrite "V1" with "A" wherever "V2" is "A"
df$V1[which(df$V2 == "A")] <- "A"
# drop the temporary column
df$V2 <- NULL
Edit: here is a one-liner provided by @thelatemail.
df %>% group_by(ID) %>% mutate(V1 = if (any(V1 == "A")) "A" else V1)
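To make the one-liner easier to try out, here is a minimal self-contained sketch. It assumes the dplyr package is installed and rebuilds the sample data with the same values as the dput() output given at the end of this thread; the name res is only illustrative.

```r
library(dplyr)

# Sample data (same values as the dput() output at the end of this thread)
df <- data.frame(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  V1 = c("A", "C", "B", "B", "A", "A", "B", "C", "C"),
  stringsAsFactors = FALSE
)

# any() collapses each group to a single TRUE/FALSE, so a plain if/else works:
# groups containing "A" get the scalar "A" (recycled to the group size),
# all other groups keep their V1 values unchanged
res <- df %>%
  group_by(ID) %>%
  mutate(V1 = if (any(V1 == "A")) "A" else V1) %>%
  ungroup()

print(res)
```

A plain if/else is used here rather than ifelse() because the condition is a single value per group, and if/else can return the whole V1 vector in the else branch.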
Answer 1 (score: 0)
We can use data.table: convert the 'data.frame' to a 'data.table' (setDT(df)), group by 'ID', and if 'A' is %in% V1, assign (:=) 'A' to 'V1', or else return 'V1'.
library(data.table)
setDT(df)[, V1 := if('A' %in% V1) 'A' else V1, ID]
df
# ID V1
#1: 1 A
#2: 1 A
#3: 1 A
#4: 2 A
#5: 2 A
#6: 2 A
#7: 3 B
#8: 3 C
#9: 3 C
Or we can use a base R solution with ave:
df$V1[with(df, ave(V1=="A", ID, FUN = any))] <- 'A'
Data:
df <- structure(list(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  V1 = c("A", "C", "B", "B", "A", "A", "B", "C", "C")
), .Names = c("ID", "V1"), class = "data.frame", row.names = c(NA, -9L))
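The ave approach packs three steps into one line. As a rough sketch of what happens in between, the version below splits it up using only base R; the intermediate names is_a and has_a are made up for illustration and are not part of the original answer.

```r
# Sample data (same values as the structure() call above)
df <- data.frame(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  V1 = c("A", "C", "B", "B", "A", "A", "B", "C", "C"),
  stringsAsFactors = FALSE
)

# Step 1: row-wise test, TRUE where V1 is already "A"
is_a <- df$V1 == "A"

# Step 2: ave() applies any() within each ID and recycles the
# single result back to every row of that ID
has_a <- ave(is_a, df$ID, FUN = any)

# Step 3: overwrite V1 for all rows whose ID contains at least one "A"
df$V1[has_a] <- "A"

print(df)
```

Because ave returns a vector of the same length as its input, has_a can be used directly as a logical row index.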