I have a data frame like this:
1 Tertiary seen.
2 No tertiary seen.
3 No anything seen.
4 Tertiary everywhere.
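I want to add a column that is filled only where Tertiary is seen, and not where the text matches a "No ... seen." pattern. The result should look like this: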
1 Tertiary seen. Tertiary
2 No tertiary seen. NA
3 No anything seen. NA
4 Tertiary everywhere. Tertiary
I know I can use | in str_extract, but & does not seem to be accepted, as in:
Mydata$newcol<-str_extract(Mydata$Text,"[Tt]ertiary&!No.*[Tt]ertiary\\.")
Answer 0 (score: 2)
You can try a negative lookbehind, for example:
Mydata$newcol[grepl("(?<!No )[Tt]ertiary", Mydata$Text, perl = TRUE)] <- "Tertiary"
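The lookbehind idea can be tried end to end on the sample rows from the question. The following is a minimal base R sketch, not necessarily what the answerer ran; the helper name hit is made up for illustration, and the data frame is rebuilt here so the block runs on its own.

```r
# Sample data copied from the question
Mydata <- data.frame(
  Text = c("Tertiary seen.", "No tertiary seen.",
           "No anything seen.", "Tertiary everywhere."),
  stringsAsFactors = FALSE
)

# Negative lookbehind (needs perl = TRUE): "tertiary" in either case,
# but only when it is not directly preceded by "No "
hit <- grepl("(?<!No )[Tt]ertiary", Mydata$Text, perl = TRUE)

# Fill the new column; rows without a hit stay NA
Mydata$newcol <- ifelse(hit, "Tertiary", NA_character_)
print(Mydata)
```

Two caveats apply. A lookahead such as (?!No ) placed directly in front of the word would never exclude anything, because the text at that position is the word itself. And the lookbehind only checks the three characters immediately before the match, so a sentence like "No sign of tertiary" would still count as a hit.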
Answer 1 (score: 1)
An "and" pattern can be expressed as a "NOT (not A or not B)" pattern. See also regex - Regular Expressions: Is there an AND operator? - Stack Overflow.
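library(dplyr)
library(stringr)

# Sample data from the question; tibble() replaces the deprecated data_frame()
Mydata <- tibble(
  Text = c("Tertiary seen.",
           "No tertiary seen.",
           "No anything seen.",
           "Tertiary everywhere.")
)

# The second alternative needs a literal "." right after "tertiary", so it
# matches no row here; only text that starts with "Tertiary" is extracted.
Mydata %>%
  mutate(
    newcol = str_extract(Text, "^(^[Tt]ertiary|^No.*[Tt]ertiary\\.)")
  )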
# A tibble: 4 × 2
# Text newcol
# <chr> <chr>
# 1 Tertiary seen. Tertiary
# 2 No tertiary seen. <NA>
# 3 No anything seen. <NA>
# 4 Tertiary everywhere. Tertiary
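Because grepl returns a logical vector, the "and" can also live outside the regex. The sketch below is an alternative to the answer's pattern, not a restatement of it; the names has_tertiary, starts_no and one_regex are made up for illustration, and it assumes that "not preceded by No" means "the text does not start with the word No".

```r
Text <- c("Tertiary seen.", "No tertiary seen.",
          "No anything seen.", "Tertiary everywhere.")

# Condition A: the text mentions tertiary in either case
has_tertiary <- grepl("[Tt]ertiary", Text)

# Condition B: the text starts with the word "No"
starts_no <- grepl("^No\\b", Text, perl = TRUE)

# "A and not B" written with R's own logical operators
newcol <- ifelse(has_tertiary & !starts_no, "Tertiary", NA_character_)

# The same test as a single regex: a negative lookahead at the start
# rejects any text beginning with "No" before looking for the word
one_regex <- grepl("^(?!No\\b).*[Tt]ertiary", Text, perl = TRUE)

print(data.frame(Text, newcol, one_regex))
```

Keeping the two conditions as separate grepl calls tends to be easier to read and to extend than a single lookaround pattern, at the cost of scanning the text twice.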