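I am looking for a Python module, or some existing Python code, that can wrap text in which ">" line prefixes indicate quoting (see the example below).
I know I can use the Python textwrap module to wrap paragraphs of text. However, that module knows nothing about this kind of quote prefix.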
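To illustrate the limitation, here is a minimal sketch. The two-line sample string and the variable names are made up for this illustration. Passing quoted text straight to `textwrap.fill` treats ">" as an ordinary word, so a quote marker ends up in the middle of a rewrapped line. Stripping the markers and handing the prefix back through `initial_indent` and `subsequent_indent` works, but only for a single quote level at a time.

```python
import textwrap

# Hypothetical quoted snippet, used only for illustration.
quoted = "> The quick\n> brown fox jumped over the lazy sleeping dog."

# textwrap sees ">" as just another word, so the second marker is
# folded into the middle of the rewrapped paragraph.
naive = textwrap.fill(quoted, width=30)
print(naive)

# Strip the markers first, then let textwrap re-add a fixed prefix
# to every output line. This handles one quote level only.
body = " ".join(line.lstrip("> ") for line in quoted.splitlines())
fixed = textwrap.fill(body, width=30,
                      initial_indent="> ", subsequent_indent="> ")
print(fixed)
```

A message that mixes unquoted text with ">" and ">>" levels would still need to be split into per-prefix paragraphs before anything like this can be applied.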
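I know how to write a routine that performs this wrapping, and I am not looking for advice on how to write it. Rather, I would like to know whether anyone is aware of existing Python code or a Python module that can already wrap email-style quoted text this way.
I have been searching, but I have not found anything in Python.
I just don't want to "reinvent the wheel" if something like this has already been written.
Here is an example of the wrapping I want to perform. Suppose I have the following text from a hypothetical email message: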
Abc defg hijk lmnop.
Mary had a little lamb.
Her fleas were white as snow,
> Now is the time for all good men to come to the aid of their party.
>
> The quick
> brown fox jumped over the lazy sleeping dog.
>> When in the Course of human
>> events it
>> becomes necessary for one people to dissolve the political
>> bands
>> which have
>> connected them ...
and everywhere that Mary went,
her fleas were sure to go
... and to reproduce.
> What do you mean by this?
>> with another
>> and to assume among
>> the powers of the earth ...
> Doo wah diddy, diddy dum, diddy doo.
>> Text text text text text text text text text text text text text text text text text text text text text text text text text text text.
Assuming I want to wrap at column 52, the resulting text should look like this:
Abc defg hijk lmnop.
Mary had a little lamb. Her fleas were white as
snow,
> Now is the time for all good men to come to the
> aid of their party.
>
> The quick brown fox jumped over the lazy sleeping
> dog.
>> When in the Course of human events it becomes
>> necessary for one people to dissolve the
>> political bands which have connected them ...
and everywhere that Mary went, her fleas were
sure to go ... and to reproduce.
> What do you mean by this?
>> with another and to assume among the powers of
>> the earth ...
> Doo wah diddy, diddy dum, diddy doo.
>> Text text text text text text text text text text
>> text text text text text text text text text text
>> text text text text text text text.
Thanks for any pointers to existing Python code.
If nothing like this exists "in the wild", I will write it myself and post my code here.
Thank you very much.
Answer 0 (score: 0)
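I was unable to find existing code that wraps this kind of quoted text, so here is the code I wrote. It uses the re and textwrap modules.
The code breaks the text into "paragraphs" based on the number of leading quote or indentation characters. I then use textwrap to wrap each "paragraph" with the quote or indentation prefix stripped from every line. After wrapping, I add the prefix back to each line of the "paragraph".
One day I will clean up the code and make it more elegant, but at least it seems to work well.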
import re
import textwrap

def wrapemail(text, wrap=72):
    if not text:
        return ''

    prefix = None
    prev_prefix = None
    paragraph = []
    paragraphs = []

    # Pass 1: group consecutive lines that share a prefix into paragraphs.
    for line in text.rstrip().split('\n'):
        line = line.rstrip()
        m = wrapemail.qprefixpat.search(line)
        if m:
            # Quoted line: normalize "> >" to ">>" and keep one
            # whitespace character after the markers, if present.
            prefix = wrapemail.whitepat.sub('', m.group(1))
            text = m.group(2)
            if text and wrapemail.whitepat.search(text[0]):
                prefix += text[0]
                text = text[1:]
        else:
            m = wrapemail.wprefixpat.search(line)
            if m:
                # Indented line: the leading whitespace is the prefix.
                prefix = m.group(1)
                text = m.group(2)
            else:
                prefix = ''
                text = line

        if not text:
            # An empty line closes the current paragraph and is kept
            # as a paragraph of its own.
            if paragraph and prev_prefix is not None:
                paragraphs.append((prev_prefix, paragraph))
            paragraphs.append((prefix, ['']))
            prev_prefix = None
            paragraph = []
        elif prefix != prev_prefix:
            # A change of prefix starts a new paragraph.
            if paragraph and prev_prefix is not None:
                paragraphs.append((prev_prefix, paragraph))
            prev_prefix = prefix
            paragraph = []

        paragraph.append(text)

    if paragraph and prefix is not None:
        paragraphs.append((prefix, paragraph))

    # Pass 2: wrap each paragraph and put its prefix back on every line.
    result = ''
    for paragraph in paragraphs:
        prefix = paragraph[0]
        text = '\n'.join(paragraph[1]).rstrip()
        wraplen = wrap - len(prefix)
        if wraplen < 1:
            # No room left to wrap; emit the text unchanged.
            result += '{}{}\n'.format(prefix, text)
        elif text:
            for line in textwrap.wrap(text, wraplen):
                result += '{}{}\n'.format(prefix, line.rstrip())
        else:
            result += '{}\n'.format(prefix)

    return result

# Compiled patterns are stored as attributes on the function.
wrapemail.qprefixpat = re.compile(r'^([\s>]*>)([^>]*)$')
wrapemail.wprefixpat = re.compile(r'^(\s+)(\S.*)?$')
wrapemail.whitepat = re.compile(r'\s')
Feeding it the text from the original message, with 'wrap' set to 52, does produce the output I specified above.
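To make the prefix handling easier to follow, here is a small standalone sketch of just the quote-prefix step. It reuses the two regular expressions from the code above, but the names `QUOTE_PREFIX`, `WHITESPACE`, and `split_prefix` are chosen for this illustration and are not part of the posted function. It also leaves out the separate handling of indented, unquoted lines.

```python
import re

# Same patterns as in the posted code, bound to local names
# (the names are assumptions made for this sketch).
QUOTE_PREFIX = re.compile(r'^([\s>]*>)([^>]*)$')
WHITESPACE = re.compile(r'\s')

def split_prefix(line):
    """Return (normalized_prefix, text); ('', line) if not quoted."""
    m = QUOTE_PREFIX.search(line)
    if not m:
        return '', line
    # Collapse "> >" into ">>" so equal depths compare equal.
    prefix = WHITESPACE.sub('', m.group(1))
    text = m.group(2)
    # Keep a single whitespace character after the markers.
    if text and WHITESPACE.search(text[0]):
        prefix += text[0]
        text = text[1:]
    return prefix, text

for sample in ['>> bands', '> > which have', '>', 'plain text', '> a > b']:
    print(repr(sample), '->', split_prefix(sample))
```

One thing this sketch brings out: because the second group is `[^>]*`, a quoted line whose body contains another ">" (such as '> a > b') does not match the pattern and comes back as unquoted text. That may or may not matter for a given mailbox, but it is worth knowing before relying on the pattern.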
Feel free to improve or steal it. :)