Question

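Yes, I know, "don't parse HTML with regex". I'm doing this in Notepad++ and it's a one-time thing, so please bear with me for a moment.

I'm trying to simplify some HTML code by using some more advanced techniques. Notably, I have "inserts" or "callouts", or whatever you call them, in my documentation, marking "note", "warning" and "technical" passages to draw the reader's attention to important information: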

<div class="note">
    <p><strong>Notes</strong>: This icon shows you something that complements 
     the information around it. Understanding notes is not critical but 
     may be helpful when using the product.</p>
</div>
<div class="warning">
    <p><strong>Warnings</strong>: This icon shows information that may 
     be critical when using the product. 
     It is important to pay attention to these warnings.</p>
</div>
<div class="technical">
    <p><strong>Technical</strong>: This icon shows technical information 
     that may require some technical knowledge to understand. </p>
</div>

I would like to simplify this HTML to the following:

<div class="box note"><strong>Notes</strong>: This icon shows you something that complements 
     the information around it. Understanding notes is not critical but 
     may be helpful when using the product.</div>
<div class="box warning"><strong>Warnings</strong>: This icon shows information that may 
     be critical when using the product. 
     It is important to pay attention to these warnings.</div>
<div class="box technical"><strong>Technical</strong>: This icon shows technical information 
     that may require some technical knowledge to understand.</div>

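I almost have the regex needed to do a nice global search & replace across my project in Notepad++, but it isn't picking up "only" the first div, it's picking up all of them: if my cursor is at the beginning of my file, the "selection" when I click Find runs from the first <div class="something"> until the very last </div>, basically.

Here's my expression: <div class="(.*[^"])">[^<]*<p>(.*?)<\/p>[^<]*<\/div> (Notepad++ "automatically" adds / / around it, sort of).

What am I doing wrong here?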

Answer 1

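You have a greedy dot quantifier where you match the class attribute, and that is the culprit. Because . has to match newlines for your multi-line <p> content to match at all, the greedy .* runs to the end of the file and backtracks only as far as the last "> from which the rest of the pattern still matches. That is why the selection spans from the first div to the last one.

Make it non-greedy: <div class="(.*?[^"])"> or change it to a character class: <div class="([^"]*)">.

Compare: greedy class vs. non-greedy class.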
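The difference can also be reproduced outside Notepad++. The following is a minimal sketch in Python, not the Notepad++ engine itself: it assumes that re.DOTALL behaves like Notepad++'s ". matches newline" option, and the two-callout sample text and variable names are made up for illustration.

```python
import re

# Shortened, hypothetical sample with two callouts (not the asker's full file).
html = '''<div class="note">
    <p><strong>Notes</strong>: first callout.</p>
</div>
<div class="warning">
    <p><strong>Warnings</strong>: second callout.</p>
</div>
'''

# re.DOTALL lets "." match newlines; assumed here to mirror
# Notepad++'s ". matches newline" checkbox.
tail = r'">[^<]*<p>(.*?)</p>[^<]*</div>'
greedy = re.compile(r'<div class="(.*[^"])' + tail, re.DOTALL)   # original pattern
lazy = re.compile(r'<div class="(.*?[^"])' + tail, re.DOTALL)    # non-greedy fix
charclass = re.compile(r'<div class="([^"]*)' + tail, re.DOTALL) # character-class fix

greedy_matches = greedy.findall(html)
lazy_matches = lazy.findall(html)
fixed_matches = charclass.findall(html)

# Rewrite each callout into the simplified form from the question.
simplified = charclass.sub(r'<div class="box \1">\2</div>', html)

print(len(greedy_matches), len(lazy_matches), len(fixed_matches))
print(simplified)
```

On this sample the greedy pattern produces a single match that swallows both divs, while the other two produce one match per div. The character class is arguably the safer of the two fixes, since [^"]* can never run past the closing quote of the attribute, whereas a lazy .*? can still expand across elements when the rest of the pattern fails to match nearby.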

Regex to match only the first occurrence of an HTML element
