Insert newlines only at word breaks so that no line exceeds 80 characters

Time: 2018-12-22 10:06:37

Tags: ruby regex string

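I have blocks of text of varying length that may or may not already contain a few newlines. I want to take each line of this text and insert additional newlines, but only at word breaks (spaces), so that no line exceeds 80 characters. Lines can be shorter than 80, but I'd like them to come as close to 80 characters as possible without going over and without cutting a word in half.

Here is some sample content: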

"""
Mittens the cat ate a salad on Friday morning. He's a cat, so I'm not really sure why he was eating a salad, but that's what he was doing. Do cats normally like salad? It wasn't salmon-flavored or anything crazy like that.

I can think of three reasons why a cat might eat salad:
1. The cat is insane
2. The cat likes mice, and just last week we noticed that a mice was eating some salad, so maybe the cat decided to shortcut the food chain and eat a salad instead of getting the nutrients of the salad from the consumption of the mouse.
3. Cats are weird.
"""

After running it through the newline inserter, it would look like this:

"""
Mittens the cat ate a salad on Friday morning. He's a cat, so I'm not really
sure why he was eating a salad, but that's what he was doing. Do cats normally
like salad? It wasn't salmon-flavored or anything crazy like that.

I can think of three reasons why a cat might eat salad:
1. The cat is insane
2. The cat likes mice, and just last week we noticed that a mice was eating some
salad, so maybe the cat decided to shortcut the food chain and eat a salad
instead of getting the nutrients of the salad from the consumption of the mouse.
3. Cats are weird.
"""

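I've found several questions that solve the problem of adding a newline at exactly N characters (frankly trivial). I know I could split on spaces, count characters, and backtrack to add a newline whenever a line goes over 80 characters, but that's tedious and not the "elegant" solution I'm looking for ;p

But if I can't find a better way, I'll go that route... I guess.

My gut tells me that a regex with lookahead/lookbehind would make for a nice solution.

Here's what I have so far: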

content = """
this is some content with words and stuff
and here is another line things
"""
content = content.gsub(%r{(.{10}) }, "\\1\n")
puts content

Which outputs:

this is some
content with
words and stuff
and here is
another line
things

However, this breaks lines just past 10 characters rather than just before.

3 Answers:

Answer 0: (score: 1)

I ended up going the route of iterating over the words:

def wordwrap(content, line_length)
  words = content.scan(/(?:\A|\s)[^\s]*/)
  remaining = line_length
  words.each do |word|
    if word.length > remaining
      word.gsub!(/^\s/, "")
      remaining = line_length - word.length
      word.insert(0, "\n")
    else
      if word =~ /^\n/
        remaining = line_length - word.length - 1
      else
        remaining -= word.length
      end
    end
  end
  words.join
end

This inserts a newline before any word that would otherwise push the line past line_length characters.

It's a bit messier than I'd hoped, but it gets the job done.
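For comparison, the same word-by-word idea can be sketched more compactly by building each output row as you go. This is only an illustrative variant, not the code above: the name wrap_by_words and the row-accumulating structure are assumptions made for this example, and it normalizes runs of whitespace inside a line to single spaces.

```ruby
# Hypothetical helper (not from the answer above): greedy word wrap that
# treats every existing line independently and never splits a word.
def wrap_by_words(content, line_length = 80)
  # The -1 limit keeps trailing empty fields, so a trailing newline survives.
  content.split("\n", -1).map { |line|
    rows = []
    line.split(" ").each do |word|
      # Start a new row when adding " word" would exceed the limit.
      if rows.empty? || rows.last.length + 1 + word.length > line_length
        rows << word.dup
      else
        rows.last << " " << word
      end
    end
    rows.join("\n")
  }.join("\n")
end

puts wrap_by_words("this is some content with words and stuff", 10)
# this is
# some
# content
# with words
# and stuff
```

A word longer than line_length is left on a row of its own rather than being cut.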

Answer 1: (score: 1)

The Rails way:

puts content.gsub(/(.{1,10})(?:\s+|$)/, "\\1\n")
# >>this is
# >>some
# >>content
# >>with words
# >>and stuff
# >>and here
# >>is another
# >>line
# >>things

Cf. https://apidock.com/rails/ActionView/Helpers/TextHelper/word_wrap
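Outside of Rails, the same regex can be wrapped in a small plain-Ruby method. The sketch below follows the general shape of the linked helper as I understand it (split on existing newlines, wrap only the lines that are too long), but the name word_wrap_sketch and its signature are made up for this example; check the linked page for the actual source.

```ruby
# Hypothetical plain-Ruby stand-in for the Rails helper linked above.
def word_wrap_sketch(text, line_width = 80)
  text.split("\n").map { |line|
    # Lines that already fit are passed through untouched.
    next line if line.length <= line_width
    # Take up to line_width characters, ending at whitespace or end of line.
    line.gsub(/(.{1,#{line_width}})(?:\s+|$)/, "\\1\n").rstrip
  }.join("\n")
end

puts word_wrap_sketch("this is some content with words and stuff", 10)
```

Note that split("\n") drops trailing empty fields, so a trailing newline in the input is not preserved by this sketch.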

Answer 2: (score: 0)

r = /.{,80}[\n ]/ 

puts content.gsub(r) { |s| s[0..-2] << "\n" }

This displays the following:

Mittens the cat ate a salad on Friday morning. He's a cat, so I'm not really
sure why he was eating a salad, but that's what he was doing. Do cats normally
like salad? It wasn't salmon-flavored or anything crazy like that.

I can think of three reasons why a cat might eat salad:
1. The cat is insane.
2. The cat likes mice, and just last week we noticed that a mice was eating some
salad, so maybe the cat decided to shortcut the food chain and eat a salad
instead of getting the nutrients of the salad from the consumption of the mouse.
3. Cats are weird.

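The regular expression matches up to 80 characters followed by a newline or a space. Inside the block, the last character of the match is replaced with a newline, regardless of whether it was a space or a newline.

For this to work, the string must not contain multiple consecutive spaces, and it must end with a newline (i.e., "...weird.\n"). If those conditions are not guaranteed, a simple preprocessing step takes care of them: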

mod_content = content.squeeze(' ').chomp << "\n"
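Putting the preprocessing step and the substitution together, a minimal sketch might look like the following. The method name wrap_at_breaks and the width parameter are assumptions added for illustration; the regex and block are the ones from this answer, with 80 made configurable.

```ruby
# Hypothetical wrapper around the gsub shown in this answer.
def wrap_at_breaks(content, width = 80)
  # Collapse runs of spaces and guarantee exactly one trailing newline.
  normalized = content.squeeze(' ').chomp << "\n"
  # Match up to `width` characters followed by a space or newline, then
  # swap that final character for a newline.
  normalized.gsub(/.{0,#{width}}[\n ]/) { |s| s[0..-2] << "\n" }
end

text = "2. The cat likes mice, and just last week we noticed that a mice was " \
       "eating some salad, so maybe the cat decided to shortcut the food chain."
puts wrap_at_breaks(text)
```

A single word longer than the width is not split by this pattern, so such a line would still run over the limit.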