Adding a missing period before a newline in Python

Time: 2016-12-30 20:51:18

Tags: python regex string

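I want to write a regular expression in Python that does the following:

Find every sentence that ends in [alphanumeric]\n and insert a "." before the newline.

For example: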

I went there
I also went there.
It is


That-
I too went there!
It went there?
It is 3

The text above should be converted to:

I went there.
I also went there.
It is.


That-
I too went there!
It went there?
It is 3.

How can I do this?

Edit: the input string is

s = "I went there\nI also went there.\nIt is\n\nThat-\nI too went there!\nIt went there?\nIt is 3"

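Also, a line ending in "?" should not get a "." appended.

EDIT2: I modified the example so that it contains a double \n and a sentence ending in -. A line ending in "-" should not get a "." appended either.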

2 answers:

Answer 0 (score: 4)

Try something like this:

s = 'I went there'

if s[-1] not in ['!', ',', '.', '\n']:
    s += '.'

Edit

With your new input, the following should work:

new_string = ''.join('{}.\n'.format(item) if (item and item[-1] not in '!?.,-') else '{}\n'.format(item) for item in s.split('\n'))

If you don't want the trailing \n at the end of new_string, you can strip it:

new_string = new_string.rstrip('\n')

Output:

>>> s = "I went there\nI also went there.\nIt is\n\nThat-\nI too went there!\nIt went there?\nIt is 3"
>>> new_string = ''.join('{}.\n'.format(item) if (item and item[-1] not in '!?.,-') else '{}\n'.format(item) for item in s.split('\n'))
>>>
>>> print(new_string)
I went there.
I also went there.
It is.

That-
I too went there!
It went there?
It is 3.
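
For readability, the one-liner above can be unpacked into a small function. This is only a sketch of the same split-and-check idea: the name add_missing_periods and the parameter no_period_after are made up for illustration, and joining with "\n" (instead of appending "\n" to every item) is a small change that avoids the trailing newline, so no rstrip is needed.

```python
def add_missing_periods(text, no_period_after="!?.,-"):
    """Append '.' to every non-empty line that does not end in listed punctuation."""
    fixed_lines = []
    for line in text.split("\n"):
        # Empty lines stay empty; lines already ending in punctuation stay as-is.
        if line and line[-1] not in no_period_after:
            line += "."
        fixed_lines.append(line)
    # Joining with "\n" puts newlines only between lines, not after the last one.
    return "\n".join(fixed_lines)


s = "I went there\nI also went there.\nIt is\n\nThat-\nI too went there!\nIt went there?\nIt is 3"
result = add_missing_periods(s)
print(result)
```

Other characters that should block the period (for example ":" or ";") can be passed through no_period_after.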

Answer 1 (score: 0)

A simple solution using the re.sub() function and a specific regex pattern:

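import re

s = "I went there\nI also went there.\nIt is\n\nThat-\nI too went there!\nIt went there?\nIt is 3"
# flags is passed by keyword (the 4th positional argument of re.sub is count); the hyphen goes last in the class so it is literal
s = re.sub(r'(?<=[^,.!?\s-])(\n|$)', r'.\1', s, flags=re.M)
print(s)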

Output:

I went there.
I also went there.
It is.

That-
I too went there!
It went there?
It is 3.

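(\n|$) - matches a newline or the end of the string

(?<=[^,.!?\s-]) - a lookbehind that ensures the character just before that match is not a comma, period, exclamation mark, question mark, whitespace character or hyphen. The hyphen is placed last in the character class so it is read literally; written as ?-\s, Python rejects it as a bad character range.

\1 - refers to the first capture group, (\n|$)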
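
If the rule is taken literally from the question (a period only after a line that ends in an alphanumeric character), a positive lookbehind may be simpler than listing the excluded punctuation. The following is a sketch under that assumption: it treats "alphanumeric" as ASCII letters and digits, and a line ending in trailing spaces would be left unchanged.

```python
import re

s = "I went there\nI also went there.\nIt is\n\nThat-\nI too went there!\nIt went there?\nIt is 3"

# (?<=[A-Za-z0-9]) : the previous character must be an ASCII letter or digit
# $ with re.M      : matches just before every "\n" and at the very end of the string
pattern = re.compile(r"(?<=[A-Za-z0-9])$", flags=re.M)

# The match is empty, so sub() simply inserts "." at each matching position.
result = pattern.sub(".", s)
print(result)
```

Passing flags=re.M by keyword matters when calling re.sub() directly, because the fourth positional argument of re.sub() is count, not flags.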