R regular expressions

Time: 2019-01-24 01:12:49

Tags: r

I am learning the sub and gsub functions. After reading the definitions, I still don't understand what ".*" and "\\s" mean.

Specifically, what does the following code block return? I don't understand how it works.

awards <- c("Won 1 Oscar.", "Won 1 Oscar. Another 9 wins & 24 nominations.", "1 win and 2 nominations.", "2 wins & 3 nominations.", "Nominated for 2 Golden Globes. 1 more win & 2 nominations.", "4 wins & 1 nomination.")

sub(".*\\s([0-9]+)\\snomination.*$", "\\1", awards)

1 answer:

Answer 0 (score: 0)

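".*": "." matches any single character, and "*" means zero or more repetitions of the preceding item.
"\\s": matches any single whitespace character (the backslash is doubled because it must be escaped inside an R string).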
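
To see the two pieces in isolation, here is a minimal sketch. The sample string x and the variable names are only illustrative; they are not part of the question.

```r
x <- "2 wins & 3 nominations."

# ".*" is greedy: it matches the whole string, so sub() replaces all of it
whole <- sub(".*", "X", x)        # "X"

# "\\s" matches one whitespace character; sub() replaces only the first match
first_ws <- sub("\\s", "_", x)    # "2_wins & 3 nominations."

# gsub() replaces every match instead of only the first one
all_ws <- gsub("\\s", "_", x)     # "2_wins_&_3_nominations."
```

This also shows the difference between the two functions asked about: sub stops after the first match, while gsub keeps going.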

So:

sub(paste0(".*",          # any character, zero or more times (greedy)
           "\\s",         # followed by one whitespace character
           "([0-9]+)",    # one or more digits; the () capture them as group 1
           "\\s",         # followed by another whitespace character
           "nomination",  # followed by the literal text "nomination"
           ".*$"),        # then any remaining characters up to the end of the string
    "\\1", awards)        # "\\1" replaces the whole match with capture group 1

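Given the example strings, the first string does not contain the word "nomination", so it does not match and is returned unchanged. All the other strings match, so each is replaced by the number immediately before the word "nomination".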
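
Running the call from the question confirms this. The sketch below repeats the awards vector so that it runs on its own; the names result and has_nomination, and the grepl check, are additions for illustration.

```r
awards <- c("Won 1 Oscar.",
            "Won 1 Oscar. Another 9 wins & 24 nominations.",
            "1 win and 2 nominations.",
            "2 wins & 3 nominations.",
            "Nominated for 2 Golden Globes. 1 more win & 2 nominations.",
            "4 wins & 1 nomination.")

# Replace each matching string with the digits captured by ([0-9]+)
result <- sub(".*\\s([0-9]+)\\snomination.*$", "\\1", awards)
# result is c("Won 1 Oscar.", "24", "2", "3", "2", "1")

# grepl() reports which elements contain a number followed by "nomination"
has_nomination <- grepl("[0-9]+\\snomination", awards)
# has_nomination is c(FALSE, TRUE, TRUE, TRUE, TRUE, TRUE)
```

Because sub leaves non-matching elements untouched, a check such as grepl is one way to tell a real extracted number apart from an unchanged original string.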

Hope this helps.