Question

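I hope my question is okay. I searched Stack Overflow and found similar questions, but none of the solutions worked for me.

I have HTML like this: <h1>Beatles: A Hard Days Night</h1>. Now I want a regular expression that matches everything after the colon, in this case A Hard Days Night.

This is what I tried: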

$pattern = "/<h1>\:(.*)<\/h1>/";

But this just outputs an empty array.

Answer 1

The following regular expression should match it:

<h1>[^:]+:\s+([^<]+)

Tested in PowerShell:

PS> '<h1>Beatles: A Hard Days Night</h1>' -match '<h1>[^:]+:\s+([^<]+)'; $Matches
True

Name                           Value
----                           -----
1                              A Hard Days Night
0                              <h1>Beatles: A Hard Days Night
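
Since the question uses a PHP-style pattern, here is a minimal sketch of the same check with preg_match, assuming the HTML is already in a string. The variable names are illustrative only. It also runs the original attempt, which likely fails because it requires the colon directly after <h1>, while the sample has "Beatles" in between.

```php
<?php
// Sample input taken from the question.
$html = '<h1>Beatles: A Hard Days Night</h1>';

// Original attempt: expects a colon immediately after <h1>, so nothing matches.
$original = '/<h1>\:(.*)<\/h1>/';
$originalCount = preg_match($original, $html, $originalMatches);

// Pattern from this answer, wrapped in PHP's / delimiters.
$pattern = '/<h1>[^:]+:\s+([^<]+)/';
$count = preg_match($pattern, $html, $matches);

if ($count === 1) {
    // Group 1 holds the text after the colon.
    echo $matches[1], PHP_EOL;
}
```

preg_match returns 1 on a match and 0 otherwise, so comparing against 1 avoids reading $matches[1] when nothing matched.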

A bit of explanation:

<h1>    # match literal <h1>
[^:]+   # match everything *before* the colon (which in this case
        # shouldn't include a colon itself; if it does, then use .*)
:       # Literal colon
\s+     # One or more whitespace characters
([^<]+) # Put everything up to the next < into a capturing group.
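
The note about using .* when the prefix contains a colon can be illustrated with a small sketch. The input below is hypothetical (it is not from the question) and has two colons, to show how the two variants pick different split points.

```php
<?php
// Hypothetical input whose prefix contains a colon of its own.
$html = '<h1>Beatles: Live: A Hard Days Night</h1>';

// [^:]+ stops at the first colon, so the capture starts right after "Beatles: ".
preg_match('/<h1>[^:]+:\s+([^<]+)/', $html, $first);

// Greedy .* backtracks to the last colon that is followed by whitespace.
preg_match('/<h1>.*:\s+([^<]+)/', $html, $last);

echo $first[1], PHP_EOL; // Live: A Hard Days Night
echo $last[1], PHP_EOL;  // A Hard Days Night
```

Which variant is appropriate depends on whether the part you want can itself contain a colon, so it is worth checking against real titles before settling on one.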

Regex: match after the colon in an H1 tag?
