I'm trying to use a Ruby regular expression to match word combinations like the ones shown below.
In the example below I only need cases 1-4, which are written in uppercase to make testing easier. The middle word (dbo or bcd in cases 1 and 2) can be anything, or nothing at all as in case #3. I'm having trouble getting the double-period case (#3) to work. It would also be good to match the standalone word, as in case 4, but maybe that is too much for a single regex?
My script works only partially; it still needs to match SALES and alpha..SALES.
Answer 0 (score: 2)
s = '1 alpha.dbo.SALES 2 alpha.bcd.SALES 3 alpha..SALES 4 SALES
bad cases 5x alpha.saleS 6x saleSXX 7x alpha.abc.SALES.etc'
regex = /(?<=^|\s)(?:alpha\.[a-z]*\.)?(?:sales)(?=\s|$)/i
puts 'R: ' + s.scan(regex).to_s
Output:
R: ["alpha.dbo.SALES", "alpha.bcd.SALES", "alpha..SALES", "SALES"]
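To check the pattern one case at a time instead of scanning the whole string, a minimal sketch might look like the following. The constant `SALES_REF` and the method `sales_ref?` are hypothetical names introduced here for illustration; the pattern itself is copied unchanged from the answer.

```ruby
# Pattern copied from the answer above; SALES_REF and sales_ref? are
# illustrative names, not part of the original answer.
SALES_REF = /(?<=^|\s)(?:alpha\.[a-z]*\.)?(?:sales)(?=\s|$)/i

# Returns true when the pattern matches somewhere in the token.
# For a single token without whitespace, the lookarounds force the match
# to start at the beginning and end at the end of the token.
def sales_ref?(token)
  token.match?(SALES_REF)
end

wanted   = %w[alpha.dbo.SALES alpha.bcd.SALES alpha..SALES SALES]
unwanted = %w[alpha.saleS saleSXX alpha.abc.SALES.etc]

(wanted + unwanted).each do |token|
  puts format('%-22s %s', token, sales_ref?(token) ? 'match' : 'no match')
end
```

Note that the `i` flag makes the whole pattern case-insensitive, so a lowercase `sales` on its own would also be accepted; drop the flag if only the uppercase form should match.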
Answer 1 (score: 2)
r = /
    (?<=\d[ ])          # match a digit followed by a space in a positive lookbehind
    (?:                 # begin a non-capture group
      \p{Alpha}+        # match one or more letters
      \.                # match a period
      (?:               # begin a non-capture group
        \p{Alpha}+      # match one or more letters
        \.              # match a period
        |               # or
        \.              # match a period
      )                 # end non-capture group
    )?                  # end non-capture group and optionally match it
    SALES               # match string
    (?![.\p{Alpha}])    # do not match a period or letter (negative lookahead)
    /x                  # free-spacing regex definition mode.
s.scan(r)
#=> ["alpha.dbo.SALES", "alpha.bcd.SALES", "alpha..SALES", "SALES"]
This regular expression would conventionally be written as follows.
r = /(?<=\d )(?:\p{Alpha}+\.(?:\p{Alpha}+\.|\.))?SALES(?![.\p{Alpha}])/
In free-spacing mode, the space must be placed inside a character class ([ ]); otherwise it would be stripped from the pattern.
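The effect of free-spacing mode on a literal space can be seen in a small sketch like the one below. The variable names are made up for illustration, and the patterns are simplified stand-ins rather than the full regex from the answer.

```ruby
# With the x flag, a bare space in the pattern is ignored,
# so this behaves like /\dSALES/.
bare_space = /\d SALES/x

# A space inside a character class survives free-spacing mode.
class_space = /\d[ ]SALES/x

# \s also works, but it matches any whitespace, not only a space.
any_space = /\d\sSALES/x

text = '4 SALES'
puts bare_space.match?(text)      # false: the pattern expects "4SALES"
puts class_space.match?(text)     # true
puts any_space.match?(text)       # true
puts bare_space.match?('4SALES')  # true: the space was stripped
```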