Question

I want to extract a genus and species name from a string. Example:

"He saw a Panthera leo in the savanna"

I would like to get "Panthera leo" by specifying only the genus name.

I tried using the word function (stringr package):

my_sentence<-"He saw a Panthera leo in the savanna"
word(my_sentence,"Panthera",+1)

I know the problem comes from the "+1" argument. Do you have any clue?

Maybe I should use the gsub function instead?

Answer 1

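my_sentence <- "He saw a Panthera leo in the savanna"
# strsplit() returns a list; take its first element to get the vector of words
x <- strsplit(my_sentence, " ")[[1]]
index <- grep("Panthera", x)
# the matched word plus the one after it, rejoined into a single string
want <- paste(x[c(index, index + 1)], collapse = " ")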
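
The split-and-index idea also suggests why the word() call in the question fails: as far as I know, stringr's word() expects integer start and end positions, not a search string plus an offset. A minimal sketch under that assumption, which also assumes words are separated by single spaces and the genus occurs exactly once:

```r
library(stringr)

my_sentence <- "He saw a Panthera leo in the savanna"

# Find the position of the genus among the space-separated words.
words <- strsplit(my_sentence, " ")[[1]]
index <- which(words == "Panthera")

# word() extracts the words from position `start` to position `end`.
want <- word(my_sentence, start = index, end = index + 1)
want
```

If the genus can appear several times, or not at all, `index` will not be a single number and this sketch would need extra handling.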

Answer 2

Regex-fu:

> m <- gregexpr('panthera\\W\\w\\w+', "He saw a Panthera leo in the savanna", ignore.case = TRUE)
> regmatches("He saw a Panthera leo in the savanna", m)
[[1]]
[1] "Panthera leo"

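\W\w\w+ is one non-word character, one word character, then one or more word characters. This means the word following panthera must be at least 2 characters long, and exactly one non-word character (such as a space) must separate the two words.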
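
To make the pattern's constraints concrete, here is a small base R check. The second and third sentences are made-up test inputs, not from the question:

```r
pattern <- "panthera\\W\\w\\w+"

sentences <- c(
  "He saw a Panthera leo in the savanna",   # following word has 3 characters
  "He saw a Panthera x in the savanna",     # following word has only 1 character
  "He saw a Panthera, leo in the savanna"   # two non-word characters after the genus
)

# Which sentences contain a match at all?
matched <- grepl(pattern, sentences, ignore.case = TRUE)
matched
# TRUE FALSE FALSE

# Extract the first match from each sentence that has one.
m <- regexpr(pattern, sentences, ignore.case = TRUE)
found <- regmatches(sentences, m)
found
# "Panthera leo"
```

If looser separators should be accepted, `\\W+` in place of `\\W` would allow one or more non-word characters between the two words.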

In stringr, it looks like this:

> library(stringr)
> s <- "He saw a Panthera leo in the savanna"
> pat <- regex('panthera\\W\\w\\w+', ignore_case = TRUE)
> str_extract(s, pat)
[1] "Panthera leo"

I think I like this one better.
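
Since str_extract() is vectorised, the same pattern can be wrapped in a small helper that takes any genus name. The function name `extract_binomial` and the extra sentences are hypothetical, and the sketch assumes the genus contains no regex metacharacters:

```r
library(stringr)

# Hypothetical helper: returns the genus plus the following word, or NA if absent.
extract_binomial <- function(text, genus) {
  pat <- regex(paste0(genus, "\\W\\w\\w+"), ignore_case = TRUE)
  str_extract(text, pat)
}

sentences <- c(
  "He saw a Panthera leo in the savanna",
  "She photographed a Canis lupus",
  "No animals here"
)

extract_binomial(sentences, "Panthera")
# "Panthera leo" NA NA
extract_binomial(sentences, "canis")
# NA "Canis lupus" NA
```

Sentences without the genus come back as NA, so the result stays aligned with the input vector.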

Get a specific word and the following word in a string
