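I'm using Python 3 and trying to split text into sentences after each period, where a period may be followed directly by a note number: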
text = "Reproduction now becomes posited as “natural” production.16 Fortunati joins Marx in a minute but crucial declension from usevalue to nonvalue. "
This is the closest sentence-splitting regex I've found that still works:
sentences = re.split(r' *[\.\?!][\'"\)\]]* +', text)
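I'm basically lost when it comes to catching, via regex, the cases where a number comes immediately after a period. Any help with correctly incorporating [0-9] into the expression? Thanks.
Edit: this is what the ideal split would look like: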
sentences[0]= "Reproduction now becomes posited as “natural” production.16"
sentences[1]= " Fortunati joins Marx in a minute but crucial declension from usevalue to nonvalue."
Answer 0 (score: 0)
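If you can use a third-party module, the regex module supports variable-width lookbehind assertions and can split on an empty match:
>>> import regex
>>> regex.split(r'(?<=\.\d+\b)', text, flags=regex.VERSION1)
['Reproduction now becomes posited as “natural” production.16',
' Fortunati joins Marx in a minute but crucial declension ...']
With only the standard library, use re.findall to collect the sentences instead of splitting.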
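A minimal sketch of the re.findall approach using only the standard library. The pattern and the name SENTENCE are illustrative assumptions, not taken from the answer above: each match is the shortest run of characters ending in a period, question mark, or exclamation mark, followed by optional closing quotes or brackets and an optional note number.

```python
import re

text = ("Reproduction now becomes posited as “natural” production.16 "
        "Fortunati joins Marx in a minute but crucial declension "
        "from usevalue to nonvalue. ")

# Hypothetical pattern: lazy body, one terminator, optional closing
# quotes/brackets, then any digits glued to the terminator (the note number).
SENTENCE = re.compile(r'.+?[.?!][\'"”)\]]*\d*', re.DOTALL)

# findall returns whole matches here because the pattern has no groups.
sentences = SENTENCE.findall(text)

for s in sentences:
    print(repr(s))
```

Each sentence after the first keeps its leading space, as in the desired output in the question; call s.strip() if that is unwanted. This sketch will also cut inside decimals such as 3.14 and after abbreviations such as "e.g.", so it only suits text where a digit after a period is always a note number.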