Question

I am trying to write a regular expression with three capture groups. The string/text I want to match is as follows:

<td class="no-wrap past-rating" style="background-color: rgb(228, 254, 199);">
                    <div>
                        <b class="place">2</b><sup> 1</sup><sup class="remaining"> 1/2</sup>
                    </div>
                    <div>
                        46.96
                    </div>
                </td>

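I am trying to match: 2, 1 and 1/2.

I have written the following regular expressions. Each one matches the required text on its own, but when I combine any two or all three of them, I get no match.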

/(?<one>(?<=<b class="place">).*(?=<\/b>))/ matches=> 2 

/(?<two>(?<=<\/b><sup>).*?(?=<\/sup><sup class=))/ matches=> 1

/(?<three>(?<=="remaining">).*(?=<\/sup>))/ matches=> 1/2

Unfortunately,

/(?<one>(?<=<b class="place">).*(?=<\/b>))(?<two>(?<=<\/b><sup>).*?(?=<\/sup><sup class=))(?<three>(?<=="remaining">).*(?=<\/sup>))/

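does not match anything. Can anyone tell me where I am going wrong, and why the combined regex fails while the individual expressions match successfully?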

Answer 1

Maybe you should try something like this:

/<b class="place">(.*)<\/b><sup>\s*(.*)<\/sup><sup class="remaining">\s*(.*)<\/sup>/

Demo online
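
As a rough check, the pattern above can be run against the snippet from the question. The sketch below assumes Python's `re` module (the question does not name a language), and the variable names are illustrative only. Python has no `/.../` delimiters, so the `\/` escapes are written as plain `/`.

```python
import re

# Snippet from the question (indentation shortened).
html = '''<div>
    <b class="place">2</b><sup> 1</sup><sup class="remaining"> 1/2</sup>
</div>
<div>
    46.96
</div>'''

# One pattern that consumes the tags between the three values,
# instead of three lookaround-only patterns glued together.
pattern = re.compile(
    r'<b class="place">(.*)</b><sup>\s*(.*)</sup>'
    r'<sup class="remaining">\s*(.*)</sup>'
)

match = pattern.search(html)
groups = match.groups() if match else None
print(groups)  # ('2', '1', '1/2')
```

Note that `.*` is greedy, so if several such cells ever shared one line it could run past the intended closing tag; `[^<]*` would be a tighter choice in that case.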

Answer 2

I guess you could use a simpler regex, namely:

/>\s*?([\d\/]+)\s*?<\//

Output:

MATCH 1
`2`
MATCH 2
`1`
MATCH 3
`1/2`

Demo:

https://regex101.com/r/dC7zR5/1

Explanation:

/>\s*?([\d\/]+)\s*?<\//gm

    > matches the characters > literally
    \s*? match any white space character [\r\n\t\f ]
        Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
    1st Capturing group ([\d\/]+)
        [\d\/]+ match a single character present in the list below
            Quantifier: + Between one and unlimited times, as many times as possible, giving back as needed [greedy]
            \d match a digit [0-9]
            \/ matches the character / literally
    \s*? match any white space character [\r\n\t\f ]
        Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
    < matches the characters < literally
    \/ matches the character / literally
    g modifier: global. All matches (don't return on first match)
    m modifier: multi-line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string)
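
The same pattern can be tried outside regex101. The following is a minimal sketch assuming Python's `re` module, where `re.findall` stands in for the `g` flag; the `m` flag is omitted because the pattern contains neither `^` nor `$`.

```python
import re

# Full cell from the question (indentation shortened).
html = '''<td class="no-wrap past-rating" style="background-color: rgb(228, 254, 199);">
    <div>
        <b class="place">2</b><sup> 1</sup><sup class="remaining"> 1/2</sup>
    </div>
    <div>
        46.96
    </div>
</td>'''

# findall returns the content of group 1 for every match in the text.
values = re.findall(r'>\s*?([\d/]+)\s*?</', html)
print(values)  # ['2', '1', '1/2']
```

The value 46.96 is skipped only because `.` is not in the character class `[\d/]`, so this approach depends on the other cell contents not consisting purely of digits and slashes.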

Answer 3

To "combine" the regexes, you need to use the alternation operator |:

(?<one>(?<=<b class="place">).*(?=<\/b>))|(?<two>(?<=<\/b><sup>).*?(?=<\/sup><sup class=))|(?<three>(?<=="remaining">).*(?=<\/sup>))

See demo
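
The reason plain concatenation fails is that lookarounds are zero-width. Group one stops right before `</b>`, and in a concatenated pattern group two has to start at that very same position, where the text immediately before it is `2`, not `</b><sup>`, so its lookbehind can never succeed. With alternation each branch is free to match at its own position. A minimal sketch of both behaviours, assuming Python's `re` module, which spells named groups `(?P<name>...)`:

```python
import re

line = '<b class="place">2</b><sup> 1</sup><sup class="remaining"> 1/2</sup>'

one = r'(?P<one>(?<=<b class="place">).*(?=</b>))'
two = r'(?P<two>(?<=</b><sup>).*?(?=</sup><sup class=))'
three = r'(?P<three>(?<=="remaining">).*(?=</sup>))'

# Concatenation: each group must start exactly where the previous one ended,
# so the lookbehind of group two cannot be satisfied.
concatenated = re.search(one + two + three, line)

# Alternation: each branch may match at its own position in the text.
found = {}
for m in re.finditer('|'.join([one, two, three]), line):
    # lastgroup is the name of the group that took part in this match.
    found[m.lastgroup] = m.group().strip()

print(concatenated)  # None
print(found)         # {'one': '2', 'two': '1', 'three': '1/2'}
```

The `strip()` call is there because the raw matches for groups two and three include the leading space inside the `<sup>` tags.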

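However, since what you are trying to match is a piece of HTML, I would use a regex that can cope with multiple attributes inside the tags and with input text that spans several lines, like this: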

<b\b[^<]*class="place"[^<]*>(?<one>[^<]*)|<\/b><sup[^<]*>(?<two>[^<]*)|="remaining"[^<]*>(?<three>[^<]*(?=<\/sup>))

See another demo
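
To illustrate the point about extra attributes, here is a small sketch, again assuming Python's `re` module with `(?P<name>...)` group syntax. The `id`, `data-x` and `title` attributes in the input are made up for the example and do not appear in the question.

```python
import re

# Hypothetical variant of the question's markup: the id, data-x and title
# attributes are invented to show that extra attributes do not break the match.
html = '''<div>
    <b id="p" class="place" data-x="1">2</b><sup title="t"> 1</sup><sup class="remaining"> 1/2</sup>
</div>'''

pattern = re.compile(
    r'<b\b[^<]*class="place"[^<]*>(?P<one>[^<]*)'
    r'|</b><sup[^<]*>(?P<two>[^<]*)'
    r'|="remaining"[^<]*>(?P<three>[^<]*(?=</sup>))'
)

# Each match fills exactly one named group; collect them by name.
found = {m.lastgroup: m.group(m.lastgroup).strip() for m in pattern.finditer(html)}
print(found)  # {'one': '2', 'two': '1', 'three': '1/2'}
```

Because the tags themselves are consumed by the pattern rather than checked by lookbehinds, the values land in the named groups while the full match also covers the surrounding markup.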

Why do my regular expressions work individually but fail when I combine them?
