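I am trying to use this RegEx search in Ruby: <div class="ms3">(\n.*?)+<
but as soon as I add the final character, "<", it stops working entirely. I have tested the RegEx in Rubular and it works fine there. I write my code in RubyMine, but I also ran it from PowerShell and got the same result. There is no error message. When I run <div class="ms3">(\n.*?)+
it prints <div class="ms3">
which is exactly what I am looking for, but as soon as I add "<" I get no results at all.
My code: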
#!/usr/bin/ruby
# encoding: utf-8
File.open('ms3.txt', 'w') do |fo|
  fo.puts File.foreach('input.txt').grep(/<div class="ms3">(\n.*?)+/)
end
Some of the content I am searching:
<div class="ms3">
<span xml:lang="zxx"><span xml:lang="zxx">Still the tone of the remainder of the chapter is bleak. The</span> <span class="See_In_Glossary" xml:lang="zxx">DAY OF THE <span class="Name_Of_God" xml:lang="zxx">LORD</span></span> <span xml:lang="zxx">holds no hope for deliverance (5.16–18); the futility of offering sacrifices unmatched by common justice is once more underlined, and exile seems certain (5.21–27).</span></span>
</div>
<div class="Paragraph">
<span class="Verse_Number" id="idAMO_5_1" xml:lang="zxx">1</span><span class="scrText">Listen, people of Israel, to this funeral song which I sing over you:</span>
</div>
<div class="Stanza_Break"></div>
The full RegEx I need is <div class="ms3">(\n.*?)+<\/div>
As it stands, it picks up the first part and nothing else.
Answer 0 (score: 1)
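The problem starts with using File.foreach('input.txt'), which cuts the input into lines. This means the pattern is matched against each line separately, so no single line can match the full pattern (by definition, no line has a \n in the middle of it). The shorter pattern still matched because each line keeps its trailing \n, so the line <div class="ms3"> followed by \n satisfies (\n.*?)+ with an empty .*? and nothing after it.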
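To see the per-line behaviour concretely, here is a minimal sketch that runs both patterns against an in-memory string instead of input.txt. The sample text, the constant names, and the grep_by_line helper are assumptions made for illustration. String#each_line stands in for File.foreach, since both yield one line at a time with the trailing \n kept.

```ruby
# Hypothetical stand-in for the contents of input.txt.
SAMPLE = <<~HTML
  <div class="ms3">
  <span xml:lang="zxx">Still the tone of the remainder of the chapter is bleak.</span>
  </div>
  <div class="Paragraph">
  <span class="scrText">Listen, people of Israel</span>
  </div>
HTML

SHORT_PATTERN = /<div class="ms3">(\n.*?)+/
FULL_PATTERN  = /<div class="ms3">(\n.*?)+<\/div>/

# Hypothetical helper: mimics File.foreach(path).grep(pattern) on a string.
# each_line yields one line at a time, each ending in its own "\n".
def grep_by_line(text, pattern)
  text.each_line.grep(pattern)
end

# Only the opening-tag line matches the short pattern.
p grep_by_line(SAMPLE, SHORT_PATTERN)

# No single line contains both the opening tag and a later "<",
# so the full pattern finds nothing.
p grep_by_line(SAMPLE, FULL_PATTERN)
```

Rubular behaves differently because it tests the pattern against the whole pasted text as one string, not line by line.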
You should have better luck reading the whole text as a single string and using match on it:
File.read('input.txt').match(/<div class="ms3">(\n.*?)+<\/div>/)
# => #<MatchData "<div class=\"ms3\">\n <span xml:lang=\"zxx\">
# => <span xml:lang=\"zxx\">Still the tone of the remainder of the chapter is bleak. The</span>
# => <span class=\"See_In_Glossary\" xml:lang=\"zxx\">DAY OF THE
# => <span class=\"Name_Of_God\" xml:lang=\"zxx\">LORD</span></span>
# => <span xml:lang=\"zxx\">holds no hope for deliverance (5.16–18);
# => the futility of offering sacrifices unmatched by common justice is once more
# => underlined, and exile seems certain (5.21–27).</span></span>\n </div>" 1:"\n ">
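Note that match returns only the first occurrence. If the input may contain several ms3 blocks and the goal is still to write them to ms3.txt, one possible approach is String#scan, sketched below. The names MS3_BLOCK, ms3_blocks and extract_ms3 are hypothetical, and the sketch assumes an ms3 block never contains a nested div. The group is made non-capturing, (?:...), because scan returns the capture groups instead of the whole match when the pattern has any.

```ruby
# Same pattern as above, but with a non-capturing group so that
# String#scan returns whole matches rather than the last "\n..." capture.
MS3_BLOCK = /<div class="ms3">(?:\n.*?)+<\/div>/

# Hypothetical helper: every ms3 block in the text, in document order.
def ms3_blocks(text)
  text.scan(MS3_BLOCK)
end

# Hypothetical helper: reads the whole input file at once, writes each
# block to the output file, and returns how many blocks were found.
def extract_ms3(input_path, output_path)
  blocks = ms3_blocks(File.read(input_path))
  File.open(output_path, 'w') { |fo| fo.puts blocks }
  blocks.length
end

# Small in-memory check with two made-up blocks.
demo = "<div class=\"ms3\">\nfirst\n</div>\n" \
       "<div class=\"Paragraph\">\nx\n</div>\n" \
       "<div class=\"ms3\">\nsecond\n</div>\n"
p ms3_blocks(demo).length
```

With the file names from the question, the call would be extract_ms3('input.txt', 'ms3.txt').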