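I have a large paragraph string that I'm trying to split into sentences using JavaScript's .split() method. I need a regular expression that matches a period or question mark [?.] followed by a space. However, I need to keep the period/question mark in the resulting array. Since JS has no positive lookbehind, how can I do this?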
Edit: example input:
"This is sentence 1. This is sentence 2? This is sentence 3."
Example output:
["This is sentence 1.", "This is sentence 2?", "This is sentence 3."]
Answer 0 (score: 1)
Forget split(). You want match():
var text = "This is an example paragraph. Oh and it has a question? Ok it's followed by some other random stuff. Bye.";
var matches = text.match(/[\w\s'\";\(\)\,]+(\.|\?)(\s|$)/g);
alert(matches);
The resulting matches array contains each sentence:
Array[4]
0:"This is an example paragraph. "
1:"Oh and it has a question? "
2:"Ok it's followed by some other random stuff. "
3:"Bye."
Here is a fiddle for further testing: https://jsfiddle.net/uds4cww3/
Edited to also match the end of the string.
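Two caveats with the snippet above: match() with the g flag returns null when nothing matches, and each match keeps the whitespace it consumed after the terminator. A minimal sketch that guards against both, assuming a hypothetical helper name splitSentences (not part of the original answer):

```javascript
// Hypothetical helper; the name is an assumption, not from the answer above.
// Uses the same pattern as the answer and returns [] instead of null.
function splitSentences(text) {
  var matches = text.match(/[\w\s'\";\(\)\,]+(\.|\?)(\s|$)/g);
  if (matches === null) {
    return [];
  }
  // Each match includes the whitespace after the terminator, so trim it off.
  return matches.map(function (sentence) {
    return sentence.trim();
  });
}

var sentences = splitSentences("This is sentence 1. This is sentence 2? This is sentence 3.");
console.log(sentences);
```

Text without a terminating period or question mark produces no match at all with this pattern, so the helper returns an empty array in that case.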
Answer 1 (score: 1)
This regex will work:
([^?.]+[?.])(?:\s|$)
JS demo:
var str = 'This is sentence 1. This is sentence 2? This is sentence 3.';
var regex = /([^?.]+[?.])(?:\s|$)/gm;
var m;
while ((m = regex.exec(str)) !== null) {
document.writeln(m[1] + '<br>');
}
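The loop above writes each sentence to the page. To get the array asked for in the question, the capture group can be collected instead. A minimal sketch, assuming an environment that supports String.prototype.matchAll (ES2020); the variable name sentences is illustrative:

```javascript
var str = 'This is sentence 1. This is sentence 2? This is sentence 3.';
// Same pattern and flags as above; matchAll requires the g flag.
var regex = /([^?.]+[?.])(?:\s|$)/gm;

// Group 1 holds the sentence with its terminator but without the trailing whitespace.
var sentences = Array.from(str.matchAll(regex), function (m) {
  return m[1];
});

console.log(sentences);
```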
Answer 2 (score: 0)
Answer 3 (score: 0)
It's cheesy, but it works:
var breakIntoSentences = function(s) {
var l = [];
s.replace(/[^.?]+.?/g, a => l.push(a));
return l;
}
breakIntoSentences("how? who cares.")
["how?", " who cares."]
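(How it really works: the RE matches a run of non-punctuation characters, followed by one more character. Because the first part is greedy, that character can only be a punctuation mark, or nothing at all at the end of the string.)
This only captures the first of a run of punctuation marks, so breakIntoSentences("how???? who cares...") also returns ["how?", " who cares."]. If you want to capture all of the punctuation, use /[^.?]+[.?]*/g as the RE instead.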
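To make the difference between the two patterns concrete, here is a small comparison on that input. It is only a sketch: collect is an illustrative helper name that applies the same replace/push trick with the pattern passed in.

```javascript
// Illustrative helper: the replace/push trick, parameterized by the pattern.
function collect(s, re) {
  var l = [];
  s.replace(re, function (a) { l.push(a); });
  return l;
}

var input = "how???? who cares...";
// One optional trailing character: only the first punctuation mark is kept.
var firstOnly = collect(input, /[^.?]+.?/g);
// Zero or more trailing punctuation marks: the whole run is kept.
var allPunct = collect(input, /[^.?]+[.?]*/g);

console.log(firstOnly); // ["how?", " who cares."]
console.log(allPunct);  // ["how????", " who cares..."]
```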
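Edit: Hahaha: Wavvves taught me about match(), which does exactly what the replace/push is doing. You learn something every damn day.
In its most minimal form, supporting three punctuation marks and using ES6 syntax, we get: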
const breakIntoSentences = s => s.match(/[^.?,]+[.?,]*/g)
Answer 4 (score: 0)
I guess .match will do it:
(?:\s?)(.*?[.?])
That is:
var sentence = "This is sentence 1. This is sentence 2? This is sentence 3.";
var result = sentence.match(/(?:\s?)(.*?[.?])/ig);
for (var i = 0; i < result.length; i++) {
document.write(result[i]+"<br>");
}
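Note that with the g flag, match() returns the full matches rather than the capture group, so every sentence after the first keeps the leading space consumed by \s?. A minimal sketch of one way to tidy that up by trimming each match (an addition for illustration, not part of the answer above):

```javascript
var sentence = "This is sentence 1. This is sentence 2? This is sentence 3.";
// Fall back to an empty array, since match() returns null when nothing matches.
var raw = sentence.match(/(?:\s?)(.*?[.?])/g) || [];
// raw[1] is " This is sentence 2?" (leading space included), so trim each entry.
var result = raw.map(function (s) {
  return s.trim();
});

console.log(result);
```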