Retaining NA in the output of an ifelse statement when using paste

Time: 2014-10-23 13:31:03

Tags: r if-statement dataframe na

My data look like columns M1 to M4 below, and I generate M_NEW using the code from here.

M1    M2    M3       M4   M_NEW
1     1,2   0        1    1
3,4   3,4   1,2,3,4  4    3,4
NA    NA    1        2    NA

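The code looks across the four columns for numbers that occur at least a specified number of times and reports them in M_NEW. Now I would like to append the numbers 0,21 to every observation, unless that observation is NA. So far, however, I have been unable to paste 0,21 onto the observations without also pasting it onto the NA values. The desired output is included in the df below as M_NEW1. How can this be achieved? It seems I am missing something about paste here.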

# sample data
df <- structure(list(M1 = structure(c(3L, 4L, 2L, 2L, 1L, 5L, NA, 6L
), .Label = c("0", "1", "1,2", "1,2,3,4", "1,2,3,4,5", "3,4,5,6,7"
), class = "factor"), M2 = structure(c(3L, NA, 2L, 2L, 1L, 4L, 
NA, 5L), .Label = c("0", "1,2", "1,2,3,4,5", "4,5,6", "4,5,6,7,8,9,10,11,12,13,14"
), class = "factor"), M3 = structure(c(3L, NA, 1L, 1L, 1L, 2L, 
NA, 4L), .Label = c("0", "1,2,3,4", "1,2,3,4,5", "1,2,3,4,5,6,7,8"
), class = "factor"), M4 = structure(c(3L, NA, 1L, 2L, 1L, 5L, 
NA, 4L), .Label = c("0", "1", "1,2,3,4,5,6", "1,2,3,4,5,6,7,8,9,10,11,12", 
"4,5"), class = "factor"), M_NEW1 = structure(c(3L, NA, 1L, 2L, 
1L, 5L, NA, 4L), .Label = c("0,21", "1,0,21", "1,2,3,4,5,0,21", 
"3,4,5,6,7,8,0,21", "4,5,0,21"), class = "factor")), .Names = c("M1", 
"M2", "M3", "M4", "M_NEW1"), class = "data.frame", row.names = c(NA, 
-8L))

# function slightly modified from https://stackoverflow.com/a/23203159/1670053
f <- function(x, n=3) {
  tab <- table(strsplit(paste(x, collapse=","), ","))
  res <- paste(names(tab[which(tab >= n)]), collapse=",")
  return(ifelse(is.na(res), NA, ifelse(res == 0, "0,21", paste(res,",0,21",sep=""))))
  #return(ifelse(is.na(res), ifelse(res == 0, "0,21", NA), paste(res,",0,21",sep=""))) #https://stackoverflow.com/a/17554670/1670053
  #return(ifelse(is.na(res), NA, ifelse(res == 0, "0,21", paste(na.omit(res),",0,21",sep=""))))
  #return(ifelse(is.na(res), as.character(NA), ifelse(res == 0, "0,21", paste(res,",0,21",sep=""))))
}

df$M_NEW2 <- apply(df[, 1:4], 1, f)

1 Answer:

Answer 0: (score: 1)

You could add another if/else statement. It is rather inelegant, but it will get you there.

f2 <- function(x, n=3) {
  tab <- table(strsplit(paste(x, collapse=","), ","))
  res <- paste(names(tab[which(tab >= n)]), collapse=",")
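  # Nothing reached the threshold ("") or only "0" did: return just the suffix.
  res <- ifelse(res %in% c("0", ""), "0,21", res)
  # paste() has already turned missing values into the string "NA"; leave it unsuffixed.
  if (res %in% c("NA", "0,21")) res else paste(res, "0,21", sep = ",")
}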

apply(df[1:4], 1, f2)

# "1,2,3,4,5,0,21"   "NA"  "0,21"  "1,0,21"  "0,21"  "4,5,0,21"  "NA" 
#   "3,4,5,6,7,8,0,21"
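
Note that f2 returns the character string "NA" rather than a true missing value, because paste() coerces NA to the text "NA" before the result is ever tested. That is also why the is.na(res) checks in the question never fire. If a real NA is wanted, as in M_NEW1, one possible variation is sketched below. The name f3 and the small toy data frame (four rows copied from the question's df, stored as character columns) are assumptions made only for this illustration.

```r
# Why is.na() never triggers: paste() converts NA to the string "NA".
pasted <- paste(NA, "0,21", sep = ",")   # the string "NA,0,21", not a missing value

# Hypothetical variant of f2 that returns NA_character_ instead of "NA".
f3 <- function(x, n = 3) {
  tab <- table(strsplit(paste(x, collapse = ","), ","))
  res <- paste(names(tab[which(tab >= n)]), collapse = ",")
  # Missing values were counted as the literal token "NA"; map that back to NA.
  if (res == "NA") return(NA_character_)
  # Nothing reached the threshold ("") or only "0" did: return just the suffix.
  if (res %in% c("0", "")) return("0,21")
  paste(res, "0,21", sep = ",")
}

# Toy data (assumption): rows 1, 2, 3 and 7 of the question's df.
toy <- data.frame(
  M1 = c("1,2", "1,2,3,4", "1", NA),
  M2 = c("1,2,3,4,5", NA, "1,2", NA),
  M3 = c("1,2,3,4,5", NA, "0", NA),
  M4 = c("1,2,3,4,5,6", NA, "0", NA),
  stringsAsFactors = FALSE
)

out <- apply(toy[1:4], 1, f3)
```

Here out[2] and out[4] are genuine missing values, so is.na(out) can be used on the new column afterwards. The sketch only handles the case where "NA" is the sole token reaching the threshold, which is the situation in the question's data.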