Question

In a text, I want to find every stretch of text between two markers, but without matching across a certain word.

Sample text:

Templates : You can add custom templates for your theme. Updated on 2010 look[124] end
Media RSS feed : Add the Cooliris Effect to your gallery Updated on 2011 look[124]
Role settings : Each gallery has a author Updated at 2010 ...  look[124] end
AJAX based thumbnail generator : No more server Updated on 2010 look[124] end limitation during the batch process Copy/Move : Copy or move images between Updated on 2010 this look[124] galleries Sortable Albums : Create your own sets of images Updated on 2010 this look[124] end
Upload or pictures via a zip-file (Not in Safe-mode)
Watermark function : You can add a watermark image or text 
...

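I need to find "Updated.*[124] end". Each match must start with "Updated" and end with "[number]" followed by the word "end". Some text looks very similar but does not end with "end"; that text must not match. How can I make this work?

I tried writing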

/Updated(.*?)\[.*?\]\send/msi

or

Updated(.*?)\[.*?\](?!Updated)\send

but these also match things like:

Updated on 2011 look[124] Role settings : Each gallery has a author Updated at 2010 ...  look[124] end
Updated on 2010 this look[124] galleries Sortable Albums : Create your own sets of images Updated on 2010 this look[124] end
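
The over-matching can be reproduced with a short sketch. The question does not name a language, so Python's re module is an assumption here, and the sample text is shortened to three lines.

```python
import re

# Shortened version of the sample text from the question.
text = (
    "Templates Updated on 2010 look[124] end\n"
    "Media RSS feed Updated on 2011 look[124]\n"
    "Role settings Updated at 2010 ...  look[124] end\n"
)

# First attempt from the question; the /msi flags map to re.M, re.S and re.I.
attempt = re.compile(r"Updated(.*?)\[.*?\]\send", re.M | re.S | re.I)

found = [m.group(0) for m in attempt.finditer(text)]

# A match that contains "Updated" twice started at an entry without "end"
# and ran on into the next entry.
overlong = [s for s in found if s.count("Updated") > 1]

for s in found:
    # repr() keeps the embedded newline of the over-long match visible.
    print(repr(s))
```

The lazy `.*?` inside the brackets is free to grow past the first `]`, which is how the second entry gets glued to the third.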

How do I write a regex which skips the bad matches?

http://regexr.com?2vh1j

Thanks for your advice.

Answer 1

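Assuming every invalid match has a [124] that is not followed by end, you can filter those out by not allowing any [ between Updated and the closing sequence, like this:

Updated([^[]*?)\[\d*\]\send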
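
A minimal way to check this pattern, assuming Python's re module and a shortened copy of the sample text (both are choices made for this sketch, not something the question specifies):

```python
import re

# Shortened version of the sample text from the question.
text = (
    "Templates Updated on 2010 look[124] end\n"
    "Media RSS feed Updated on 2011 look[124]\n"
    "Role settings Updated at 2010 ...  look[124] end\n"
    "Copy/Move Updated on 2010 this look[124] galleries "
    "Sortable Albums Updated on 2010 this look[124] end\n"
)

# [^[] can never step over a "[", so a match cannot run past the first
# bracketed number that follows "Updated".
pattern = re.compile(r"Updated([^[]*?)\[\d*\]\send")

matches = [m.group(0) for m in pattern.finditer(text)]
for s in matches:
    print(s)
```

Only the three entries whose bracketed number is directly followed by end should come back; the two entries without end are skipped rather than merged into the next one.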

Answer 2

To match a string that does not contain Updated, you can use either of the following constructs:

(?:[^U]+|U(?!pdated))*

and

(?:(?!Updated).)*

Using the first alternative gives you this expression:

Updated((?:[^U]+|U(?!pdated))*)\[\d+\]\send

Explanation of the first alternative:

(?:          # non-capturing group
[^U]+        # any characters that aren't "U"
|U(?!pdated) # or a "U" which is not followed by "pdated" (i.e. not "Updated")
)*           # repeated as much as possible

The second alternative:

(?:          # non-capturing group
(?!Updated). # Use a lookahead check at every character to make sure it's not "Updated"
)*           # repeated as much as possible
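
To see that the two constructs stop at the same place, here is a small sketch in Python (an assumed language). The input fragment is made up for illustration; "UI" is in there only so that the U(?!pdated) branch has something to match.

```python
import re

# Hypothetical fragment: the text after one "Updated", up to the next one.
tail = " on 2011 look[124] Role UI settings Updated at 2010"

first = re.compile(r"(?:[^U]+|U(?!pdated))*")
second = re.compile(r"(?:(?!Updated).)*")

# match() anchors at the start, so each result is exactly what the
# construct is willing to consume before it hits the next "Updated".
consumed_first = first.match(tail).group(0)
consumed_second = second.match(tail).group(0)

print(repr(consumed_first))
print(consumed_first == consumed_second)
```

One difference worth keeping in mind: `[^U]` also matches newlines, while `.` only does so when the s (DOTALL) flag is set, so the two constructs can diverge on multi-line input.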

Answer 3

I think this is what you were attempting in your second regex:

Updated\s++(?>(?!Updated\b|end\b)\S+\s+)*+end\b

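In other words, match Updated and look for the corresponding end. If you find another Updated first, you know you started in the wrong place, so you abandon that match. I also exclude end, because that lets me match the words possessively (i.e. *+); the regex never has to backtrack to find or (more importantly) rule out a match.

If you really have to specify the look[nnn] part, this should do the trick: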

Updated\s++(?>(?!Updated\b|end\b|look\[\d+\])\S+\s+)*+look\[\d+\]\s+end\b

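Add the i flag for case-insensitive matching if you need it, but you don't need the m or s flags. If this seems overly complicated, it's because I don't know your data as well as you do. Chances are this is all you really need: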

Updated(?:(?!Updated).)*\send
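
A rough check of the simplified pattern, sketched in Python (an assumption, as is the shortened sample text). The longer patterns are left out of the sketch because Python's re only accepts possessive quantifiers and atomic groups from version 3.11 on.

```python
import re

# Shortened version of the sample text from the question.
text = (
    "Templates Updated on 2010 look[124] end\n"
    "Media RSS feed Updated on 2011 look[124]\n"
    "Role settings Updated at 2010 ...  look[124] end\n"
    "Copy/Move Updated on 2010 this look[124] galleries "
    "Sortable Albums Updated on 2010 this look[124] end\n"
)

# re.I is optional; neither re.M nor re.S is set, so "." stays on one line.
pattern = re.compile(r"Updated(?:(?!Updated).)*\send", re.I)

matches = [m.group(0) for m in pattern.finditer(text)]
for s in matches:
    print(s)
```

The lookahead is tested before every character, so a candidate that would have to cross a second Updated to reach an end is dropped and the search restarts at that second Updated.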

Answer 4

Use a lazy regexp

Updated.*?\[.*?\]( end)?
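
This pattern matches every Updated ... [n] entry, with or without end. One way to use it, and this is an assumption about the intent rather than something the answer states, is to treat the optional group as a flag and keep only the matches where it took part. A Python sketch:

```python
import re

# Shortened version of the sample text from the question.
text = (
    "Templates Updated on 2010 look[124] end\n"
    "Media RSS feed Updated on 2011 look[124]\n"
    "Role settings Updated at 2010 ...  look[124] end\n"
    "Copy/Move Updated on 2010 this look[124] galleries "
    "Sortable Albums Updated on 2010 this look[124] end\n"
)

pattern = re.compile(r"Updated.*?\[.*?\]( end)?")

all_matches = list(pattern.finditer(text))

# group(1) is None whenever the optional " end" did not match,
# so filtering on it drops the entries that lack "end".
with_end = [m.group(0) for m in all_matches if m.group(1)]

print(len(all_matches))
for s in with_end:
    print(s)
```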

Answer 5

One possibility:

Updated([^[]*)\[124\]\s+end

Explanation:

Updated          # Word 'Updated'
[^[]*            # All chars until '['
\[124\]          # String '[124]'
\s+              # One or more spaces.
end              # String 'end'

Answer 6

Maybe you could try a different approach:

/Updated[\w.\s]*\[\d+\]\send/

Explanation

Updated

This matches the word Updated

[\w.\s]*

then any letters, digits, whitespace and dots (you can add any other characters you want)

\[\d+\]

then a number between square brackets

\send

then a whitespace character, and finally the word end
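
The remark about adding characters matters in practice, because anything outside the class between Updated and [ makes the match fail. A small Python sketch (language assumed); the input line with a comma is invented for illustration and does not come from the question:

```python
import re

original = re.compile(r"Updated[\w.\s]*\[\d+\]\send")
# Hypothetical variant: a comma added to the allowed characters.
with_comma = re.compile(r"Updated[\w.,\s]*\[\d+\]\send")

# Made-up input: a comma sits between "Updated" and the bracket.
line = "Updated on 2010, this look[124] end"

# The original class stops at the comma, so there is nothing to return.
print(original.search(line))
# With the comma allowed, the whole line matches.
print(with_comma.search(line).group(0))
```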

How to match everything except a certain word
