Question

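I need a regular expression pattern that matches any text between <a href="https://website.com">Health & Beauty</a>. The text may or may not contain spaces and/or the special character "&", but it should not exceed a 10-character limit. I want to extract: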

Beauty & Fashion

Here is the regex code that extracts the anchor text:

(<[[a|A][^>]*>|)

But I want to limit the text to between 1 and 10 characters. Is that possible?

Answer 1

For PCRE:

https://regex101.com/r/GJSlZl/1

For JS:

https://regex101.com/r/FIdlyU/1

The solution depends on the regex flavor:

js: (?<=<a[^>]+>)([\w &]{1,10})(?=<\/a>)
pcre: <a[^>]+>\K([\w &]{1,10})(?=<\/a>)
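
As a rough illustration of how the JS pattern behaves, here is a minimal sketch. The sample anchors and the variable names are made up for this example, and the `g` flag is added only so that every match is collected. It assumes a runtime with lookbehind support (ES2018 or later).

```javascript
// JS-flavor pattern from the answer; the g flag is an addition for this demo.
const anchorText = /(?<=<a[^>]+>)([\w &]{1,10})(?=<\/a>)/g;

// Hypothetical sample input, not taken from the original question.
const html = [
  '<a href="https://website.com">Beauty</a>',          // 6 characters
  '<a href="https://website.com">Hair & Spa</a>',      // exactly 10 characters
  '<a href="https://website.com">Health & Beauty</a>', // 15 characters, too long
].join("\n");

// match() with the g flag returns the matched strings, or null if none.
const shortTexts = html.match(anchorText) ?? [];

console.log(shortTexts);
// → [ 'Beauty', 'Hair & Spa' ]
```

The 15-character text is skipped entirely rather than truncated, because the lookahead requires `</a>` to follow immediately after at most 10 allowed characters. Note also that `[^>]+` needs at least one character after `<a`, so a bare `<a>` tag would not match with this pattern.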

Answer 2

My guess is that you are looking for an expression similar to this:

(?<=&|>)([^&\r\n]{0,10}(?=&|<\/a>))*

You may want to add more boundaries to the left-hand lookbehind:

(?<=&|>)

Test

$re = '/(?<=&|>)([^&\r\n]{0,10}(?=&|<\/a>))*/s';
$str = '<a>Health & Beauty</a>
<a href="https://website.com">Health & Beauty</a>
<a href="https://website.com">Health & Beauty 1 & Health & Beauty 1 </a>
<a>Health & Beauty 1 & Health & Beauty 1 </a>
<a>Health & Beauty 1 & Some other words & Beauty 1 & Some other words 2</a>

';

preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);

var_dump($matches);
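
For readers who want to try the same expression outside PHP, the following is a minimal JavaScript sketch, not a port of the full test above. It uses only the first sample line, and the variable names are hypothetical. It assumes ES2020 (`matchAll`) and relies on JavaScript's own handling of empty matches, so capture groups may be reported differently than in PCRE.

```javascript
// Expression from the answer, with the g flag added so matchAll can iterate.
const segmentPattern = /(?<=&|>)([^&\r\n]{0,10}(?=&|<\/a>))*/g;

// First sample line from the PHP test string.
const sample = '<a>Health & Beauty</a>';

// The expression can also match the empty string, so drop empty results
// and trim the spaces left around each "&"-separated piece.
const segments = [...sample.matchAll(segmentPattern)]
  .map((m) => m[0].trim())
  .filter((s) => s.length > 0);

console.log(segments);
// → [ 'Health', 'Beauty' ]
```

In effect the expression splits the anchor text at each `&` and keeps pieces of at most 10 characters, which is a different behavior from limiting the whole anchor text to 10 characters.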

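If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com. If you like, you can also watch in this link how it matches against some sample inputs.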
