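I am trying to find all rows in a database where a field contains a non-anchor tag whose href attribute starts with the string {clickurl}. For example, this one -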
<link foo="bar" href="{clickurl}http://wwww.google.com" ...
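Or this one (because it contains a non-anchor tag that meets the criteria) - HTTP://wwww.google.com" ... HTTP://wwww.google.com" ...
But not this one (because it is an anchor tag) - HTTP://wwww.google.com" ...
What I have done so far
Using the following regular expression, I am able to get all records in which a link tag has an href attribute that starts with {clickurl} -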
SELECT bannerid FROM ox_banners WHERE htmltemplate REGEXP "<link[^>]*href\s*=\s*[\"'][^>]*{clickurl}(.*)[\"']"
However, since I need to search not only link tags but any other tag as well (excluding anchor tags), I modified the regular expression to -
SELECT bannerid FROM ox_banners WHERE htmltemplate REGEXP "<[!a][^>]*href\s*=\s*[\"'][^>]*{clickurl}(.*)[\"']"
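But this also returns rows in which an anchor tag contains the pattern.
Update
Based on zx81's input, I am now using the expression <[^a][^>]*href[[:space:]]*=[[:space:]]*[\"'][^>]*{clickurl}(.*)[\"']
and in normal cases only non-anchor tags match. However, in a case like the following, where the href attribute sits on a tag inside an echo statement within PHP tags, it also matches (which is not wanted), because the href actually belongs to an anchor tag -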
<?php
$GLOBALS['test'] = '{clickurl}tel://test';
echo '<a href="{clickurl}test">Test</a>';
?>
I am still looking for a solution to this.
Answer 0: (score: 2)
Try this:
SELECT bannerid FROM ox_banners WHERE htmltemplate REGEXP ".*<[^a][^>]*href=\"\\{clickurl\\}.*";
Options: Case insensitive; Regex syntax only
Match any single character that is NOT a line break character (line feed) «.*»
Between zero and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives «*»
Match the character “<” literally «<»
Match any single character that is NOT present in the list below and that is NOT a line break character (line feed) «[^a]»
The literal character “a” (case insensitive) «a»
Match any single character that is NOT present in the list below and that is NOT a line break character (line feed) «[^>]*»
Between zero and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives «*»
The literal character “>” «>»
Match the character string “href="” literally (case insensitive) «href="»
Match the character “{” literally «\{»
Match the character string “clickurl” literally (case insensitive) «clickurl»
Match the character “}” literally «\}»
Match any single character that is NOT a line break character (line feed) «.*»
Between zero and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives «*»
Answer 1: (score: 1)
Please try this regular expression:
< *[^a][^>]+ *href *= *"{clickurl}
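You were almost there. It looks like you had a small typo: you wrote [!a] instead of [^a], which means "one character that is not a".
[^a] and [^>] work in almost the same way. I am sure you know this, but in both cases ^ means "not", so [^>] is any character that is not >.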
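The difference between the two character classes can be checked with a small sketch. It uses Python's re module as a stand-in for MySQL's REGEXP (an assumption: the two engines differ in places, but both read [!a] and [^a] the same way), and the patterns are simplified versions of the ones in the question, with the whitespace handling left out. The example.com URL and the names TYPO, FIXED and matches are made up for illustration.

```python
import re

# Simplified forms of the question's patterns (whitespace handling omitted).
# [!a] is a class holding the two characters "!" and "a".
TYPO = re.compile(r"""<[!a][^>]*href=["']\{clickurl\}""")
# [^a] is a negated class: any single character except "a".
FIXED = re.compile(r"""<[^a][^>]*href=["']\{clickurl\}""")

# Hypothetical sample rows, not taken from the real ox_banners table.
anchor = '<a href="{clickurl}http://example.com">x</a>'
link = '<link foo="bar" href="{clickurl}http://example.com">'


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None


print(matches(TYPO, anchor))   # True: "<a" satisfies <[!a]
print(matches(TYPO, link))     # False: "l" is neither "!" nor "a"
print(matches(FIXED, anchor))  # False: the anchor tag is skipped
print(matches(FIXED, link))    # True: the link tag is found
```

Since [!a] only accepts ! or a after the <, the typo version can only ever match tags that start with <a or <!, which is why the anchor rows came back.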
If you want to allow not just space characters but other kinds of whitespace as well, you can use [[:space:]]* in place of each " *".
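A minimal sketch of what that substitution changes, again using Python's re module as an assumed stand-in for MySQL. Python's re has no POSIX classes, so the characters that [[:space:]] covers (space, tab, newline, carriage return, form feed, vertical tab) are spelled out by hand; the names SPACES_ONLY, ANY_WS and the sample strings are hypothetical.

```python
import re

# " *" allows only literal space characters around the "=".
SPACES_ONLY = re.compile(r'href *= *"\{clickurl\}')

# Spelled-out equivalent of [[:space:]]*, since Python's re
# module does not support POSIX bracket classes.
WS = r"[ \t\n\r\f\v]*"
ANY_WS = re.compile(r"href" + WS + r"=" + WS + r'"\{clickurl\}')

spaced = 'href = "{clickurl}x"'       # plain spaces around "="
tabbed = 'href\t=\n"{clickurl}x"'     # a tab and a newline around "="

print(SPACES_ONLY.search(spaced) is not None)  # True
print(SPACES_ONLY.search(tabbed) is not None)  # False
print(ANY_WS.search(spaced) is not None)       # True
print(ANY_WS.search(tabbed) is not None)       # True
```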
Thanks to Tuga for reminding me that \s does not work in MySQL: it matches a literal "s". I was "spaced out" on that one. :)
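The PHP case from the question's update can be traced the same way. The sketch below is only one possible tightening, not something proposed in the thread: it assumes that attribute values before href never contain < or >, and it uses Python's re module (with \s standing in for [[:space:]]) rather than MySQL itself. The names LOOSE and STRICTER are made up, and the trailing (.*)[\"'] part of the question's pattern is left off because it does not affect whether a row matches.

```python
import re

# The PHP snippet from the question's update.
PHP_SNIPPET = """<?php
$GLOBALS['test'] = '{clickurl}tel://test';
echo '<a href="{clickurl}test">Test</a>';
?>"""

# The updated expression from the question. [^>]* may run across "<",
# so a match can start at "<?php" and reach the href of the later <a>.
LOOSE = re.compile(r"""<[^a][^>]*href\s*=\s*["'][^>]*\{clickurl\}""")

# Assumed tightening: forbid "<" as well as ">" between the tag's opening
# bracket and href, so a match cannot spill into a following tag.
STRICTER = re.compile(r"""<[^a<>][^<>]*href\s*=\s*["']\{clickurl\}""")

loose_hit = LOOSE.search(PHP_SNIPPET)
print(loose_hit.start())                # 0: the match begins at "<?php"
print(STRICTER.search(PHP_SNIPPET))     # None: the anchor is no longer caught

link = '<link foo="bar" href="{clickurl}http://wwww.google.com">'
print(STRICTER.search(link) is not None)  # True: non-anchor tags still match
```

In MySQL the same idea would be written with [[:space:]]* in place of \s*. Like the patterns in the thread, this still skips every tag whose name begins with "a" (not only anchors), and it would miss a tag that carries a < or > inside an earlier attribute value.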