Ruby regex to match word combinations separated by periods

Time: 2018-09-15 14:11:06

Tags: ruby regex

I'm trying to use a Ruby regex to match word combinations like the ones below. In the example I only need cases 1-4; I wrote them in uppercase to make testing easier. The middle word (dbo, bcd) can be anything, or nothing at all as in case 3. I'm having trouble getting the double-period case 3 to work. It would also be good to match a standalone SALES as a word, but maybe that is too much for one regex? My script only partially works; it still needs to match alpha..SALES.

2 answers:

Answer 0 (score: 2)

s = '1 alpha.dbo.SALES    2 alpha.bcd.SALES    3 alpha..SALES    4 SALES
bad cases 5x alpha.saleS  6x  saleSXX 7x alpha.abc.SALES.etc'

regex = /(?<=^|\s)(?:alpha\.[a-z]*\.)?(?:sales)(?=\s|$)/i
puts 'R: ' + s.scan(regex).to_s

Output:

R: ["alpha.dbo.SALES", "alpha.bcd.SALES", "alpha..SALES", "SALES"]
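To see which individual tokens the pattern accepts, the cases can also be checked one at a time. This is only a minimal sketch: the `tokens` list and the `matched`/`rejected` names are introduced here for illustration and are not part of the answer.

```ruby
regex = /(?<=^|\s)(?:alpha\.[a-z]*\.)?(?:sales)(?=\s|$)/i

# Tokens taken from the sample string, checked one at a time.
tokens = %w[alpha.dbo.SALES alpha.bcd.SALES alpha..SALES SALES
            alpha.saleS saleSXX alpha.abc.SALES.etc]

# Split the tokens into those the pattern accepts and those it rejects.
matched, rejected = tokens.partition { |token| token.match?(regex) }

puts "matched:  #{matched.inspect}"
puts "rejected: #{rejected.inspect}"

# Because of the /i flag, an all-lowercase token is accepted as well.
lowercase_ok = 'alpha.dbo.sales'.match?(regex)
puts lowercase_ok
```

Note that the `/i` flag makes the whole pattern case-insensitive, so if only uppercase SALES should count, the flag would have to be dropped or limited to the prefix.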

Answer 1 (score: 2)

r = /
    (?<=\d[ ])        # match a digit followed by a space in a positive lookbehind
    (?:               # begin a non-capture group
      \p{Alpha}+        # match one or more letters
      \.                # match a period
      (?:               # begin a non-capture group
        \p{Alpha}+      # match one or more letters
        \.              # match a period
        |               # or
        \.              # match a period
      )                 # end non-capture group
    )?                  # end non-capture group and optionally match it
    SALES             # match string
    (?!=[.\p{Alpha}]) # do not match a period or letter (negative lookahead)
    /x                # free-spacing regex definition mode.

s.scan(r)
  #=> ["alpha.dbo.SALES", "alpha.bcd.SALES", "alpha..SALES", "SALES"]
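The sample string never places a bad case directly after "digit + space", so the trailing negative lookahead is not really exercised by it. The following sketch isolates that tail on a few hypothetical inputs (the `tail` and `typo` names are mine). It also shows why the spelling matters: a negative lookahead is written `(?!...)`, whereas a group starting with `(?!=` only forbids a literal `=` followed by a period or letter.

```ruby
# Tail of the pattern on its own: SALES not followed by a period or a letter.
tail = /SALES(?![.\p{Alpha}])/

# Same tail with a stray "=" after "(?!": it now only rejects "=" + period/letter.
typo = /SALES(?!=[.\p{Alpha}])/

# Hypothetical inputs, chosen to exercise the lookahead.
inputs = ['SALES', 'SALES next', 'SALES.etc', 'SALESXX']

tail_hits = inputs.select { |str| str.match?(tail) }
typo_hits = inputs.select { |str| str.match?(typo) }

puts "tail accepts: #{tail_hits.inspect}"
puts "typo accepts: #{typo_hits.inspect}"
```

With the correct spelling only the first two inputs are accepted; with the stray `=` all four are.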

Written conventionally (without free-spacing mode), this regex is:

r = /(?<=\d )(?:\p{Alpha}+\.(?:\p{Alpha}+\.|\.))?SALES(?![.\p{Alpha}])/

In free-spacing mode, a literal space must be placed in a character class ([ ]); otherwise it is stripped from the pattern.
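The effect of free-spacing mode on a bare space can be checked directly. This is a small sketch; the `bare` and `classed` names and the sample strings are chosen here for illustration.

```ruby
# With /x, a bare space in the pattern is ignored, so this behaves like /\dSALES/.
bare = /\d SALES/x

# Inside a character class the space is kept, so this requires "digit, space, SALES".
classed = /\d[ ]SALES/x

with_space    = '4 SALES'
without_space = '4SALES'

puts bare.match?(with_space)       # false
puts bare.match?(without_space)    # true
puts classed.match?(with_space)    # true
puts classed.match?(without_space) # false
```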