I have a string that contains several parts named "Section 1" ... "Section 20", and I want to split the string into those individual sections. Here is an example:
Stuff we don't care about
Section 1
Text within this section, may contain the word section.

And go on for quite a bit.

Section 15
Another section
I would like to split it into
["Section 1\n Text within this section, may contain the word section.\n\nAnd go on for quite a bit.",
"Section 15 Another section"]
I feel dumb for not getting this right. My attempts always capture everything. Right now I have
/(Section.+\d+$[\s\S]+)/
but I can't get the greediness out of it.
Answer 0 (score: 0)
As I see it, the Regexp for splitting the text is:
/(?:\n\n|^)Section/
So the code is:
str = "
Stuff we don't care about
Section 1
Text within this section, may contain the word section.

And go on for quite a bit.

Section 15
Another section
"
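# Split before each "Section" heading, drop the leading preamble, then put back the "Section" word that split consumed.
newstr = str.split(/(?:\n\n|^)Section/, -1)[1..-1].map { |l| "Section " + l.strip }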
# => ["Section 1\nText within this section, may contain the word section.\n\nAnd go on for quite a bit.", "Section 15\nAnother section"]
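A variant of the same idea, as a sketch rather than a drop-in replacement: splitting on a zero-width lookahead leaves the heading inside each piece, so it does not have to be glued back on. This assumes every heading sits alone on its own line as "Section <number>"; the names text, pieces, and sections are made up for illustration.

```ruby
text = <<~TEXT
  Stuff we don't care about

  Section 1
  Text within this section, may contain the word section.

  And go on for quite a bit.

  Section 15
  Another section
TEXT

# Split at the start of every line that consists solely of "Section <number>".
# The lookahead is zero-width, so the heading stays in the piece it begins.
pieces = text.split(/^(?=Section \d+$)/)

# Keep only the pieces that actually begin with a heading (this drops any preamble).
sections = pieces.select { |piece| piece.match?(/\ASection \d+$/) }.map(&:strip)

p sections
# => ["Section 1\nText within this section, may contain the word section.\n\nAnd go on for quite a bit.", "Section 15\nAnother section"]
```

Filtering with select instead of blindly dropping the first piece keeps the first section intact when the text happens to start with a heading.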
Answer 1 (score: 0)
Answer 2 (score: 0)
You can use scan with this regular expression, /Section\s\d+\n(?:.(?!Section\s\d+\n))*/m:
string.scan(/Section\s\d+\n(?:.(?!Section\s\d+\n))*/m)
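Section\s\d+\n will match any Section header.
(?:.(?!Section\s\d+\n))* will match everything else, consuming one character at a time for as long as that character is not immediately followed by another Section header.
The m flag also lets the dot match newlines.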
sample = <<SAMPLE
Stuff we don't care about
Section 1
Text within this section, may contain the word section.

And go on for quite a bit.

Section 15
Another section
SAMPLE
sample.scan(/Section\s\d+\n(?:.(?!Section\s\d+\n))*/m)
#=> ["Section 1\nText within this section, may contain the word section.\n\nAnd go on for quite a bit.\n", "Section 15\nAnother section\n"]
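To see what each part of that pattern contributes, here is a small sketch on a shorter, made-up string (the text and the variable names are assumptions for illustration only). It compares the header-only pattern, the full pattern, and the full pattern without the m flag.

```ruby
text = "intro\nSection 1\nalpha\n\nbeta\n\nSection 2\ngamma\n"

header_only = /Section\s\d+\n/
with_m      = /Section\s\d+\n(?:.(?!Section\s\d+\n))*/m
without_m   = /Section\s\d+\n(?:.(?!Section\s\d+\n))*/

# The header part alone finds just the heading lines.
p text.scan(header_only)
# => ["Section 1\n", "Section 2\n"]

# With m, the dot crosses newlines and stops one character short of the next header.
p text.scan(with_m)
# => ["Section 1\nalpha\n\nbeta\n", "Section 2\ngamma\n"]

# Without m, the dot cannot match "\n", so each match ends at the first line break of the body.
p text.scan(without_m)
# => ["Section 1\nalpha", "Section 2\ngamma"]
```

The character that directly precedes a header is never consumed, which is why only one of the two newlines before "Section 2" ends up in the first match.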
Answer 3 (score: 0)
I think the simplest thing is:
str = "Stuff we don't care about
Section 1
Text within this section, may contain the word section.

And go on for quite a bit.

Section 15
Another section"
str[/^Section 1.+/m] # => "Section 1\nText within this section, may contain the word section.\n\nAnd go on for quite a bit.\n\nSection 15\nAnother section"
If you want to break the text apart at the Section headings, start the same way, then take advantage of Enumerable's slice_before:
str = "Stuff we don't care about
Section 1
Text within this section, may contain the word section.

And go on for quite a bit.

Section 15
Another section"
str[/^Section 1.+/m].split("\n").slice_before(/^Section \d+/m).map{ |a| a.join("\n") }
# => ["Section 1\nText within this section, may contain the word section.\n\nAnd go on for quite a bit.\n",
# "Section 15\nAnother section"]
The slice_before documentation says:
Creates an enumerator for each chunked elements. The beginnings of chunks are defined by pattern and the block.
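As a minimal sketch of that behavior on an already-split list of lines (the lines array here is invented for illustration): a new chunk begins at every element that matches the pattern, or for which the block returns true, and anything before the first match forms a chunk of its own.

```ruby
lines = ["preamble", "Section 1", "alpha", "", "Section 2", "beta"]

# Pattern form: each element is tested with pattern === element.
by_pattern = lines.slice_before(/\ASection \d+\z/).to_a

# Block form: a truthy return value marks the start of a new chunk.
by_block = lines.slice_before { |line| line.start_with?("Section ") }.to_a

p by_pattern
# => [["preamble"], ["Section 1", "alpha", ""], ["Section 2", "beta"]]
p by_block == by_pattern
# => true
```

The leading ["preamble"] chunk shows why the answer first trims the string with str[/^Section 1.+/m]: without that step, the text before the first heading would come back as an extra first element.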