Multiple conditions with str_extract in R

Time: 2016-12-04 10:54:36

Tags: r regex

I have a data frame as follows:

  1     Tertiary seen.
  2     No tertiary seen.
  3     No anything seen.
  4     Tertiary everywhere.

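I want to add a column that is filled only where Tertiary is seen, and not where the text matches the regex No.*seen. The desired result: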

  1     Tertiary seen.        Tertiary
  2     No tertiary seen.       NA
  3     No anything seen.       NA
  4     Tertiary everywhere.  Tertiary 

I know I can use | in str_extract, but & does not seem to be accepted, as in the following attempt:

Mydata$newcol<-str_extract(Mydata$Text,"[Tt]ertiary&!No.*[Tt]ertiary\\.")

2 answers:

Answer 0: (score: 2)

You can try a negative lookbehind, e.g.

# (?<!No ) rejects a match directly preceded by "No "; [Tt] covers both spellings
Mydata$newcol[grepl("(?<!No )[Tt]ertiary", Mydata$Text, perl = TRUE)] <- "Tertiary"
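The direction of the assertion matters here. A lookbehind `(?<!No )` inspects the text before the match, whereas a lookahead `(?!No )` placed directly in front of `Tertiary` inspects the text at the match itself and so never rejects anything. The following is a minimal sketch in base R that compares the two on the sample data from the question; the variable names `lookahead_hit` and `lookbehind_hit` are made up for illustration.

```r
# Sample data copied from the question.
Mydata <- data.frame(
  Text = c("Tertiary seen.", "No tertiary seen.",
           "No anything seen.", "Tertiary everywhere."),
  stringsAsFactors = FALSE
)

# Lookahead: checks the characters AT the match position, which are
# "Ter"/"ter" and never "No ", so row 2 is still accepted.
lookahead_hit <- grepl("(?!No )[Tt]ertiary", Mydata$Text, perl = TRUE)
# TRUE TRUE FALSE TRUE

# Lookbehind: checks the three characters BEFORE the match position,
# so "No tertiary" is rejected. Lookarounds need perl = TRUE in base R.
lookbehind_hit <- grepl("(?<!No )[Tt]ertiary", Mydata$Text, perl = TRUE)
# TRUE FALSE FALSE TRUE

# Fill the new column: "Tertiary" where the lookbehind pattern matched, NA elsewhere.
Mydata$newcol <- ifelse(lookbehind_hit, "Tertiary", NA_character_)
print(Mydata)
```

With a capitalised-only pattern the lookahead version happens to give the right answer on this data, but only because row 2 spells "tertiary" in lower case.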

Answer 1: (score: 1)

An "AND" pattern can be expressed as a "NOT (not A OR not B)" pattern. See also regex - Regular Expressions: Is there an AND operator? - Stack Overflow

library(dplyr)
library(stringr)

Mydata <- tibble(  # tibble() replaces the deprecated data_frame()
  Text = c("Tertiary seen.",
           "No tertiary seen.",
           "No anything seen.",
           "Tertiary everywhere.")
  )

Mydata %>% 
  mutate(
    newcol = str_extract(Text, "^(^[Tt]ertiary|^No.*[Tt]ertiary\\.)")
  )
# A tibble: 4 × 2
#                   Text   newcol
#                  <chr>    <chr>
# 1       Tertiary seen. Tertiary
# 2    No tertiary seen.     <NA>
# 3    No anything seen.     <NA>
# 4 Tertiary everywhere. Tertiary
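Regular expressions have no `&` operator; inside a pattern, `&` is just a literal ampersand. Two common ways to get "A and not B" are sketched below in base R, so the example runs without extra packages. The first evaluates each condition separately and combines the logical vectors with R's own `&`; the second stacks lookaheads at the start of the string. All variable names are made up for illustration, and "starts with `No `" is assumed to be what the question means by the negated case.

```r
Text <- c("Tertiary seen.", "No tertiary seen.",
          "No anything seen.", "Tertiary everywhere.")

# Option 1: test each condition on its own, then combine in R.
has_tertiary <- grepl("[Tt]ertiary", Text)
starts_no    <- grepl("^No ", Text)
keep_and     <- has_tertiary & !starts_no

# The same condition written as NOT (not A OR not B), i.e. De Morgan's law.
keep_demorgan <- !(!has_tertiary | starts_no)

# Option 2: a single pattern. Both lookaheads are tested at position 0,
# so the pattern matches only if both hold: not starting with "No ",
# and containing "tertiary" somewhere.
keep_regex <- grepl("^(?!No )(?=.*[Tt]ertiary)", Text, perl = TRUE)

# Build the column from any of the three equivalent logical vectors.
newcol <- ifelse(keep_and, "Tertiary", NA_character_)
print(data.frame(Text, newcol))
```

Option 1 is usually easier to read and debug; Option 2 is useful when only a single pattern string can be passed along.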