I have a data frame that looks like this (but with many more variables/columns):
set.seed(5)
id<-seq(5)*floor(runif(5,min=1000, max=10000))
vals1<-c("Y","N","N","N","N")
vals2<-c("N","N","N","N","N")
vals3<-c("N","N","N","Y","N")
df<-data.frame(id,vals1,vals2,vals3)
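I want to add a final column to the data frame that acts as an overall flag, with the following logic: if any of the values for an id is 'Y', the final flag is 'Y'; otherwise it is 'N'. So for this data frame, the 1st and 4th ids (2801, 14236) would have 'Y' in the final column, and the rest would have 'N'. I tried a few approaches such as apply and if...else, to no avail.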
Answer 0 (score: 3)
Initialize by assigning "N" to every row. In the next step, for the rows that contain a "Y" (checked using apply), assign "Y":
df$final = "N"
df$final[apply(df, 1, function(a) "Y" %in% a)] = "Y"
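The same rule can also be sketched without a row-wise apply. This is an alternative illustration, not part of the answer above: it checks only the value columns (so id is never coerced to character) and uses rowSums on the logical matrix that the comparison produces. Selecting columns by the `vals` name prefix is an assumption based on the example data.

```r
# Rebuild the example data from the question
set.seed(5)
id <- seq(5) * floor(runif(5, min = 1000, max = 10000))
df <- data.frame(
  id,
  vals1 = c("Y", "N", "N", "N", "N"),
  vals2 = c("N", "N", "N", "N", "N"),
  vals3 = c("N", "N", "N", "Y", "N")
)

# Assumption: every flag column shares the "vals" name prefix
val_cols <- grep("^vals", names(df), value = TRUE)

# df[val_cols] == "Y" is a logical matrix; rowSums counts the TRUEs in each row
df$final <- ifelse(rowSums(df[val_cols] == "Y") > 0, "Y", "N")
```

With many columns, this form tends to be faster than apply because the comparison and the row totals are both vectorized.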
Answer 1 (score: 2)
A solution that keeps the letter encoding is below.
set.seed(5)
id <- seq(5) * floor(runif(5, min=1000, max=10000))
vals1 <- c("Y","N","N","N","N")
vals2 <- c("N","N","N","N","N")
vals3 <- c("N","N","N","Y","N")
df <- data.frame(id, vals1, vals2, vals3)
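# If you really want to keep the letter encoding in the input columns, my solution works as below.
# Note: the result is logical (TRUE/FALSE); wrap the call in ifelse(..., 'Y', 'N') to get letters.
df$Final <- apply(df[,2:4], MARGIN = 1, FUN = function(x) {any(x == 'Y')})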
However, I think you should use booleans (TRUE / FALSE) instead.
Using apply and any:
set.seed(5)
id <- seq(5) * floor(runif(5, min=1000, max=10000))
vals1 <- c("Y","N","N","N","N")
vals2 <- c("N","N","N","N","N")
vals3 <- c("N","N","N","Y","N")
df <- data.frame(id, vals1, vals2, vals3)
# Convert your labels into booleans:
df[,2:4] <- df[,2:4] == 'Y'
# Then summarise across rows
df$Final <- apply(df[,2:4], MARGIN = 1, FUN = function(x) {any(x)})
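To illustrate why the boolean encoding can be convenient, here is a minimal sketch that repeats the conversion above and then counts and filters on the logical column directly. The `FinalLetter` column name is hypothetical, added only to show how to get back to letters if something downstream needs them.

```r
# Rebuild the example data and convert the labels to booleans, as above
set.seed(5)
id <- seq(5) * floor(runif(5, min = 1000, max = 10000))
df <- data.frame(
  id,
  vals1 = c("Y", "N", "N", "N", "N"),
  vals2 = c("N", "N", "N", "N", "N"),
  vals3 = c("N", "N", "N", "Y", "N")
)
df[, 2:4] <- df[, 2:4] == "Y"
df$Final <- apply(df[, 2:4], MARGIN = 1, FUN = any)

# Logical columns can be summed and used as a row filter without comparisons
n_flagged <- sum(df$Final)        # number of ids with at least one TRUE
flagged_ids <- df$id[df$Final]    # the ids themselves

# Hypothetical extra column: convert back to letters only if required
df$FinalLetter <- ifelse(df$Final, "Y", "N")
```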
Answer 2 (score: 1)
Somewhat similar to @d.b's answer:
df$final <- apply(df, 1, function(x) c("N","Y")[any(x == "Y")+1])
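The indexing trick in that one-liner may not be obvious, so here is a small sketch of it in isolation. In arithmetic, FALSE and TRUE coerce to 0 and 1, so adding 1 yields index 1 or 2 into the lookup vector. The names `lookup` and `one_row` are illustrative only.

```r
lookup <- c("N", "Y")

# FALSE + 1 is 1 and TRUE + 1 is 2, so a logical picks one of the two labels
idx_false <- FALSE + 1
idx_true <- TRUE + 1
lookup[idx_false]  # "N"
lookup[idx_true]   # "Y"

# Applied to a single row, as apply() would pass it (a character vector)
one_row <- c(id = "2801", vals1 = "Y", vals2 = "N", vals3 = "N")
flag <- lookup[any(one_row == "Y") + 1]
```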