Using a logical vector with sapply

Time: 2017-06-12 14:58:58

Tags: r sapply

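I am trying to use a logical vector to tell sapply which columns in my dataset to make numeric.

My data contains NAs, but every variable is either numeric or character. I take the first complete case (hard-coded below, but I would love suggestions!) and build a logical vector based on whether the first character of each string is a digit or a letter. I want to use that logical vector to tell sapply which columns are numeric.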

library(dplyr)  # %>% and mutate()
library(tidyr)  # gather()

# make data frame; this should return an all-character data frame
color <- c("red", "blue", "yellow")
number <- c(NA, 1, 3)
other.number <- c(4, 5, 7)
df <- cbind(color, number, other.number) %>% as.data.frame(stringsAsFactors = FALSE)

# get the first character of each variable in the first complete case
temp <- sapply(df, function(x) substr(x, 1, 1)) %>% as.data.frame() %>%
  .[2, ] %>% # hard-coded: row 2 is the first complete case
  gather() %>%
  # make the logical variable, which can be used as a vector
  mutate(vec = ifelse(value %in% letters, FALSE, TRUE))
# goal: apply this vector with sapply + as.numeric to the df
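
On the hard-coded row: one possible way to avoid it, sketched here in base R on the same sample data, is to look up the first complete case with complete.cases(). The names first_complete, first_chars and is_num are made up for this illustration, and the check against letters only covers lowercase a-z, as in the code above.

```r
# Sample data from the question, built directly as character columns
df <- data.frame(
  color = c("red", "blue", "yellow"),
  number = c(NA, "1", "3"),
  other.number = c("4", "5", "7"),
  stringsAsFactors = FALSE
)

# Index of the first row with no missing values (instead of hard-coding 2)
first_complete <- which(complete.cases(df))[1]

# First character of each column's value in that row
first_chars <- vapply(df[first_complete, ], function(x) substr(x, 1, 1), character(1))

# TRUE where the first character is not a lowercase letter
is_num <- !(first_chars %in% letters)
is_num
```

If no row is complete, which(...)[1] is NA, so that case would need its own handling.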

1 Answer:

Answer 0 (score: 0)

This is a strange case, but if you need to decide which columns are numeric based on their first element, one idea is to try converting that element to numeric. Any element that is not a number becomes NA (as the warning states), so you can use the result as an index. For example,

ind <- sapply(na.omit(df), function(i) !is.na(as.numeric(i[1])))

Warning message: In FUN(X[[i]], ...) : NAs introduced by coercion

ind
#       color       number other.number 
#       FALSE         TRUE         TRUE 

df[ind] <- lapply(df[ind], as.numeric)

str(df)
#'data.frame':  3 obs. of  3 variables:
# $ color       : chr  "red" "blue" "yellow"
# $ number      : num  NA 1 3
# $ other.number: num  4 5 7
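
A variation on the same idea, offered as a sketch rather than a drop-in replacement: test every non-missing value in a column instead of only the first one, and silence the coercion warning with suppressWarnings(). The helper name looks_numeric is hypothetical, and the data frame is rebuilt from the DATA section so the block runs on its own.

```r
# Sample data, all columns stored as character
df <- data.frame(
  color = c("red", "blue", "yellow"),
  number = c(NA, "1", "3"),
  other.number = c("4", "5", "7"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when a column has at least one non-missing
# value and every non-missing value can be parsed as a number
looks_numeric <- function(x) {
  x <- x[!is.na(x)]
  length(x) > 0 && !anyNA(suppressWarnings(as.numeric(x)))
}

# One logical per column, then convert only the flagged columns
ind <- vapply(df, looks_numeric, logical(1))
df[ind] <- lapply(df[ind], as.numeric)

str(df)
```

Checking the whole column is a bit more work, but it does not depend on which row happens to be the first complete case.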

DATA

dput(df)
structure(list(color = c("red", "blue", "yellow"), number = c(NA, 
"1", "3"), other.number = c("4", "5", "7")), .Names = c("color", 
"number", "other.number"), row.names = c(NA, -3L), class = "data.frame")