Question

I'm using Python 3 and trying to split text into sentences where a footnote number directly follows the period:

text = "Reproduction now becomes posited as “natural” production.16 Fortunati joins Marx in a minute but crucial declension from usevalue to nonvalue. "

This is the closest sentence-splitting regex I've come up with that still works:

sentences = re.split(r' *[\.\?!][\'"\)\]]* +', text)

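I'm basically lost with respect to capturing, via regex, the cases where digits come immediately after a period. Any help with properly incorporating [0-9] into the expression? Thanks.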

Edit: this is what the ideal split would look like:

sentences[0]= "Reproduction now becomes posited as “natural” production.16"
sentences[1]= " Fortunati joins Marx in a minute but crucial declension from usevalue to nonvalue."

Answer 1

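If you can use a third-party module, you can use regex, which allows variable-width lookbehind assertions and splitting on an empty match:

>>> import regex
>>> regex.split(r'(?<=\.\d+\b)', text, flags=regex.VERSION1)
['Reproduction now becomes posited as “natural” production.16',
 ' Fortunati joins Marx in a minute but crucial declension ...']

With only the standard library, re.findall can collect each sentence together with its trailing digits instead of splitting.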
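For the standard-library route, a minimal sketch might look like the following. The pattern and the name SENTENCE are assumptions made for illustration, not code taken from the answer. It treats a sentence as a lazy run of characters ending in terminal punctuation, optional closing quotes or brackets, and optional footnote digits, followed by whitespace or the end of the text.

```python
import re

text = ("Reproduction now becomes posited as “natural” production.16 "
        "Fortunati joins Marx in a minute but crucial declension "
        "from usevalue to nonvalue. ")

# Hypothetical pattern (an assumption, not the original answer's code):
#   .+?           lazy run of characters up to the first terminator
#   [.?!]         sentence-ending punctuation
#   [\'"”)\]]*    optional closing quotes or brackets
#   \d*           optional footnote number glued to the punctuation
#   (?=\s|$)      must be followed by whitespace or the end of the text
SENTENCE = re.compile(r'.+?[.?!][\'"”)\]]*\d*(?=\s|$)', re.DOTALL)

sentences = SENTENCE.findall(text)
for sentence in sentences:
    print(repr(sentence))
```

Because this matches sentences rather than splitting on a delimiter, the leading space stays attached to the second sentence, as in the desired output in the question. One known limitation: a decimal such as "3.14" followed by a space would also be treated as a sentence end.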

