R -- Logical grep on multiple variables within data frame

Time: 2016-02-12 19:29:49

Tags: r grepl

I want to run a string search with logical grep (grepl) in R, using multiple string patterns, across several variables (columns) of my data frame. I suspect one of the apply functions is well suited to this task, but I am not sure how to get it to work correctly. A toy example is included below:

v.grepl <- Vectorize(grepl)
pattern <- "^330|^334|^335|^343|^359|^740|^741|^742"
data <- structure(list(recnum = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                   pr1_3 = c("334", "550", "600", "812", "748", "968", "123", "456", "789", "821"),
                   pr2_3 = c("350", "222", "367", "", "", "", "", "", "", ""),
                   pr3_3 = c("857", "", "", "", "", "", "", "", "", ""), 
                   pr4_3 = c("359", "740", "336", "400", "", "", "", "", "", ""),
                   pr5_3 = c("800", "", "", "", "", "", "", "", "", "")),
              .Names = c("recnum", "pr1_3", "pr2_3", "pr3_3", "pr4_3", "pr5_3"),
              row.names = c(1L, 2L, 3L,4L, 5L, 6L, 7L, 8L, 9L, 10L),
              class = "data.frame")

data$check <- apply(data, 2, v.grepl(pattern, data[c('pr1_3', 'pr2_3', 'pr3_3', 'pr4_3', 'pr5_3')]))

The last line of code throws the following error:

Error in match.fun(FUN) : 
'v.grepl(pattern, data[c("pr1_3", "pr2_3", "pr3_3", "pr4_3", "pr5_3")])' is not a function, character or symbol

Does anyone have ideas for fixing this code so that it adds a new variable (called check) to the data frame data, flagging for each row whether any of pr1_3 through pr5_3 matches one of the strings in pattern?

Thanks!
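
For reference, here is a minimal sketch of one way the flag could be built. It is not a confirmed answer from the thread, only an illustration under a few assumptions. The error arises because apply() expects a function as its third argument, while the call above passes the result of calling v.grepl(). The sketch instead runs grepl() on each column with sapply() and then collapses the resulting logical matrix by row. The names cols and hits are hypothetical helpers introduced here, and the data frame is rebuilt with data.frame() only to keep the example self-contained.

```r
# Pattern and toy data as in the question.
pattern <- "^330|^334|^335|^343|^359|^740|^741|^742"
data <- data.frame(
  recnum = 1:10,
  pr1_3 = c("334", "550", "600", "812", "748", "968", "123", "456", "789", "821"),
  pr2_3 = c("350", "222", "367", "", "", "", "", "", "", ""),
  pr3_3 = c("857", "", "", "", "", "", "", "", "", ""),
  pr4_3 = c("359", "740", "336", "400", "", "", "", "", "", ""),
  pr5_3 = c("800", "", "", "", "", "", "", "", "", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the columns to search.
cols <- c("pr1_3", "pr2_3", "pr3_3", "pr4_3", "pr5_3")

# grepl() is already vectorised over x, so calling it once per column
# yields a logical matrix with one row per record and one column per variable.
hits <- sapply(data[cols], function(x) grepl(pattern, x))

# A row is flagged when at least one of its columns matched.
data$check <- rowSums(hits) > 0

print(data[c("recnum", "check")])
# Rows 1 and 2 are flagged ("334" in pr1_3, "740" in pr4_3); the rest are FALSE.
```

Because grepl() accepts a whole character vector, the Vectorize() wrapper is not needed in this version. Note that sapply() only returns a matrix when the data frame has more than one row, so a single-row input would need separate handling.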

0 answers:

No answers