Question

I have a formula as a character vector in R and need to remove poly() from it, if present.

Example, and some of my (unsuccessful) attempts so far:

p <- "(.*)poly\\((\\w.*)(.*)(\\))(.*)"
unique(sub(p, "\\1", "mined + poly(cover, 3) + spp"))
#> [1] "mined + "
unique(sub(p, "\\2", "mined + poly(cover, 3) + spp"))
#> [1] "cover, 3"
unique(sub(p, "\\3", "mined + poly(cover, 3) + spp"))
#> [1] ""
unique(sub(p, "\\4", "mined + poly(cover, 3) + spp"))
#> [1] ")"
unique(sub(p, "\\5", "mined + poly(cover, 3) + spp"))
#> [1] " + spp"

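The result I want:

Input: "mined + poly(cover, 3) + spp"

Output: "mined + cover + spp"

I have tried many patterns, but either poly( ..., 3) is not removed, or `, 3)` or `, 3` is left in the resulting string. Any help is appreciated! (By the way, the 3 is arbitrary; the pattern should remove any degree value.)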

Answer 1

inp <- "mined + poly(cover, 3) + spp"
gsub("poly\\((.+),\\s*\\d+\\)", "\\1", inp)
# [1] "mined + cover + spp"

Or, in a step-by-step way that is easier to follow (since you are struggling with more complex regular expressions):

library(magrittr)
gsub("[^a-zA-Z]", " ", inp) %>% # Drop everything that is not a letter, add space instead
  gsub("poly", "", .) %>%       # Drop the word poly 
  gsub("\\s+", " + ", .)        # Add '+' back in. '\\s+' stands for one or more spaces
# [1] "mined + cover + spp"
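
One caveat about the one-line pattern above: `.+` is greedy, so if a formula contains more than one poly() term (an assumption; the question only shows one), the match runs across both terms. Below is a minimal sketch of a variant that uses `[^,]+` instead, so that each capture stops at the first comma. The input `two_terms` is made up for illustration.

```r
# Hypothetical input with two poly() terms (not from the question).
two_terms <- "mined + poly(cover, 3) + poly(age, 2)"

# Original pattern: the greedy `.+` spans from the first "poly(" up to the last ", <digits>)".
greedy <- gsub("poly\\((.+),\\s*\\d+\\)", "\\1", two_terms)

# Variant: `[^,]+` cannot cross a comma, so each poly() term is matched on its own.
per_term <- gsub("poly\\(([^,]+),\\s*\\d+\\)", "\\1", two_terms)

print(greedy)
#> [1] "mined + cover, 3) + poly(age"
print(per_term)
#> [1] "mined + cover + age"
```

For a single poly() term, as in the question, both patterns give the same result.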

Answer 2

Try this regular expression:

poly\(([^,]*)[^)]*\)

Replace each match with the contents of group 1.


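Explanation:

poly\( - matches poly(
([^,]*) - matches 0 or more occurrences of any character that is not a comma; this is captured in group 1
[^)]*\) - matches 0 or more occurrences of any character that is not ), followed by a )

Now replace the whole match with the contents of group 1.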
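
The answer gives the pattern in plain regex form. As a rough sketch of how it could be applied in R: the backslashes have to be doubled inside an R string literal, and `gsub()` does the replacement with `\\1` as the group 1 reference. The helper name `strip_poly` is made up for illustration and is not part of the original answer.

```r
# Hypothetical helper; the pattern is the one from this answer,
# with each backslash doubled for an R string literal.
strip_poly <- function(x) {
  gsub("poly\\(([^,]*)[^)]*\\)", "\\1", x)
}

strip_poly("mined + poly(cover, 3) + spp")
#> [1] "mined + cover + spp"

# A formula without poly() is returned unchanged.
strip_poly("mined + spp")
#> [1] "mined + spp"
```

Because the degree is matched by `[^)]*` rather than by a digit pattern, any degree value (or any other trailing arguments without a closing parenthesis) is dropped.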

Regex - remove poly() from a formula