Regular expression for numbers in R

Time: 2015-09-09 08:04:22

Tags: regex r decimal points

I need to write a regular expression to parse the following data:

[1] "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Pigeon (4%);"
[2] "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Blue-winged teal (4%)"

This is what I have so far:

"(\\w[\\w\\s]+)\\(([0-9]+\\.[0-9][0-9]?)%\\);?"

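It works, but I have a few problems:

  • It does not recognize percentages of 10 or more (for example, 30.67%)
  • It does not recognize values with no decimal point (16%) or with only one decimal digit (17.3%)

Can anyone help?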

1 answer:

Answer 0: (score: 1)

This should help:

    library(stringr)
    # `text` is the character vector shown in the question
    str_extract_all(text, pattern = "[0-9]{1,2}(\\.[0-9]{1,2})?%")

Explanation of the regular expression:

[0-9]{1,2}     one or two digits (0-9)
  (            start of group
    \\.        a literal dot (escaped with a double backslash in an R string; an unescaped dot matches any character)
    [0-9]{1,2} one or two digits (0-9)
  )?           end of group; the group is optional
%              percent sign
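
The pattern above returns only the percentages, while the question also wants the names. A minimal base-R sketch of one way to capture both is shown here. It assumes every entry has the form `name (number%)` and that names contain neither `;` nor `(`. The names `pattern` and `parse_entries` are illustrative and not part of the original answer, and `[0-9]+` is used instead of `{1,2}` so that values such as 100% would also match.

```r
# Sample input copied from the question.
text <- c(
  "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Pigeon (4%);",
  "Chicken (30.67%);Duck (17.3%);Wild duck (16%);Blue-winged teal (4%)"
)

# Group 1: the name, any run of characters other than ';' and '(' (lazy,
#          so the space before the parenthesis is left out).
# Group 2: digits with an optional fractional part.
pattern <- "([^;(]+?)\\s*\\(([0-9]+(?:\\.[0-9]+)?)%\\)"

# Hypothetical helper: split one string into a data frame of name/percent pairs.
parse_entries <- function(x) {
  # All full matches, e.g. "Chicken (30.67%)"
  m <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
  data.frame(
    name    = sub(pattern, "\\1", m, perl = TRUE),
    percent = as.numeric(sub(pattern, "\\2", m, perl = TRUE)),
    stringsAsFactors = FALSE
  )
}

result <- lapply(text, parse_entries)
```

Matching on "anything except `;` and `(`" rather than `\\w` and `\\s` is a design choice that lets hyphenated names such as "Blue-winged teal" come through whole; it may need tightening if the real data contains other punctuation.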