Finding strings that start with the same word in R

Time: 2018-06-08 18:36:58

Tags: r regex string

Suppose I have a df with a column like this:

"Jim plays football and Mike plays soccer."

"Jim plays soccer and Mary plays the piano."

"Mike plays football and Mary plays soccer."

"Mary plays volleyball and Jim plays the piano."

...

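Is there a regular expression I can use to return all sentences that start with "Jim", all sentences that start with "Mike", and all sentences that start with "Mary"?

I don't know how to achieve this, because I thought you have to know what you are looking for when you use a regular expression, but here the thing I am searching for varies.

Many thanks.

1 Answer:

Answer 0: (score: 1)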

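You can combine gsub and split.

^(\\w+) captures the first word of the sentence; gsub replaces each sentence with that word, and split then groups the sentences by it.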

split(sentences, gsub("^(\\w+).*", "\\1", sentences))

# $Jim
# [1] "Jim plays football and Mike plays soccer."  "Jim plays soccer and Mary plays the piano."

# $Mary
# [1] "Mary plays volleyball and Jim plays the piano."

# $Mike
# [1] "Mike plays football and Mary plays soccer."
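
For completeness, here is a minimal self-contained sketch of the same approach applied to a data frame column. The data frame df and the column name text are assumptions for illustration, since the question does not name the column. It uses sub rather than gsub, which should behave the same here because the anchored pattern can match at most once per string.

```r
# Hypothetical data frame; the column name `text` is an assumption.
df <- data.frame(
  text = c(
    "Jim plays football and Mike plays soccer.",
    "Jim plays soccer and Mary plays the piano.",
    "Mike plays football and Mary plays soccer.",
    "Mary plays volleyball and Jim plays the piano."
  ),
  stringsAsFactors = FALSE
)

# Replace each sentence with its first word (capture group 1).
first_word <- sub("^(\\w+).*", "\\1", df$text)

# Group the sentences by their first word; the result is a named list.
groups <- split(df$text, first_word)

# Pull out one group by name, e.g. every sentence starting with "Jim".
jim_sentences <- groups[["Jim"]]
```

No names need to be known in advance: the grouping key comes from the data itself, so any new first word simply becomes another element of the list.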