R: How do you apply grep() inside lapply()?

Time: 2016-03-14 05:36:48

Tags: r lapply sapply tapply

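I want to apply grep() in R, but I am not very good with lapply(). I understand that lapply takes a list, applies a function to each member, and returns a list. For example, let x be a list with 2 members.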

> x<-strsplit(docs$Text," ")
> 
> x
[[1]]
 [1] "I"         "lovehttp"  "my"        "mum."      "I"         "love"     
 [7] "my"        "dad."      "I"         "love"      "my"        "brothers."

[[2]]
 [1] "I"         "live"      "in"        "Eastcoast" "now."      "Job.I"    
 [7] "used"      "to"        "live"      "in"        "WestCoast."  

I want to use grep() to remove the words that contain http. So I tried:

> lapply(x,grep(pattern="http",invert=TRUE, value=TRUE))

But it did not work. It says

Error in grep(pattern = "http", invert = TRUE, value = TRUE) : 
argument "x" is missing, with no default

So I tried

> lapply(x,grep(pattern="http",invert=TRUE, value=TRUE,x))

But it says

Error in match.fun(FUN) : 
'grep(pattern = "http", invert = TRUE, value = TRUE, x)' is not a 
function, character or symbol

Please help, thanks!

2 Answers:

Answer 0: (score: 4)

The following code removes every entry containing the substring http from each vector in the list:

repx <- function(x) {
    y <- grep("http", x)
    vec <- rep(TRUE, length(x))
    vec[y] <- FALSE
    x <- x[vec]
    return(x)
}

lapply(lst, repx)

Data:

x1 <- c("I", "lovehttp", "my", "mum.", "I", "love", "my", "dad.", "I", "love", "my", "brothers.")
x2 <- c("I", "live", "in", "Eastcoast", "now.", "Job.I", "used", "to", "live", "in", "WestCoast.")
lst <- list(x1, x2)
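
For comparison, here is a minimal sketch of the same filter using a logical mask from grepl() instead of building the index vector by hand. The helper name drop_http is hypothetical, and fixed = TRUE is an assumption that http should be matched as a literal string rather than as a regular expression.

```r
# Sample data, same as above
x1 <- c("I", "lovehttp", "my", "mum.", "I", "love", "my", "dad.", "I", "love", "my", "brothers.")
x2 <- c("I", "live", "in", "Eastcoast", "now.", "Job.I", "used", "to", "live", "in", "WestCoast.")
lst <- list(x1, x2)

# Hypothetical helper: keep only the elements that do NOT contain "http".
# grepl() returns one TRUE/FALSE per element, so negating it gives the mask directly.
drop_http <- function(v) {
  v[!grepl("http", v, fixed = TRUE)]
}

res <- lapply(lst, drop_http)
```

Because the mask always has the same length as the input, a vector with no match comes back unchanged. That is not the case with negative indexing such as v[-grep("http", v)], which returns an empty vector when nothing matches.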

Answer 1: (score: 4)

This can be done in one line:

lst <- lapply(lst, grep, pattern="http", value=TRUE, invert=TRUE)

#lst
#[[1]]
# [1] "I"         "my"        "mum."      "I"         "love"      "my"        "dad."      "I"         "love"      "my"        "brothers."
#
#[[2]]
# [1] "I"          "live"       "in"         "Eastcoast"  "now."       "Job.I"      "used"       "to"         "live"       "in"         "WestCoast."
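
The attempts in the question failed because lapply() expects a function as its second argument, while grep(...) written with parentheses is called immediately, before lapply() ever runs. Any extra named arguments placed after the function are forwarded to it for each list element. A small sketch of the two equivalent forms, using a shortened made-up list:

```r
# Shortened example data (not the full list from the question)
lst <- list(c("I", "lovehttp", "my"), c("I", "live", "in"))

# Form 1: pass grep itself; the named arguments are forwarded to every call,
# and each list element fills the remaining argument x.
a <- lapply(lst, grep, pattern = "http", value = TRUE, invert = TRUE)

# Form 2: the same call spelled out with an anonymous function.
b <- lapply(lst, function(v) grep("http", v, value = TRUE, invert = TRUE))

same <- identical(a, b)
```

Form 2 is more verbose, but it can be easier to read when the function takes several arguments.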

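If you do not want to remove the whole word that contains the pattern, but only the pattern itself while keeping the rest of the word (as mentioned in the comments), you can use gsub instead of grep: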

lapply(lst, gsub, pattern="http", replacement="")
#[[1]]
# [1] "I"         "love"      "my"        "mum."      "I"         "love"      "my"        "dad."      "I"         "love"      "my"        "brothers."
#
#[[2]]
# [1] "I"          "live"       "in"         "Eastcoast"  "now."       "Job.I"      "used"       "to"         "live"       "in"         "WestCoast."
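
One thing to watch for if the pattern is more than plain letters: gsub() treats it as a regular expression by default, so a character such as . matches any character. A minimal sketch, with made-up strings and a made-up pattern www., of how fixed = TRUE switches to literal matching:

```r
# Made-up example strings; only the first one contains the literal text "www."
words <- c("www.site", "wwwXsite")

# Default: "." is a regex wildcard, so both strings are altered
regex_result <- gsub("www.", "", words)

# fixed = TRUE: the pattern is matched literally, so only the first string changes
literal_result <- gsub("www.", "", words, fixed = TRUE)
```

For a pattern like http, which has no special characters, both variants give the same result.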