I'm new to regular expressions in JavaScript, and I'm having trouble getting an array of matches out of a text string like the following:
Sentence would go here
-foo
-bar
Another sentence would go here
-baz
-bat
I want to get an array of matches like this:
match[0] = [
'foo',
'bar'
]
match[1] = [
'baz',
'bat'
]
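In short, what I'm looking for is:
"Any dash + word (-foo, -bar, etc.) AFTER a sentence"
Can anyone provide a pattern that captures every iteration rather than only the last one? A repeated capture group only keeps its last iteration. Forgive me if this is a stupid question. If anyone wants to send me some tests, I'm using regex101.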
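To make the limitation concrete, here is a minimal sketch of a repeated capture group keeping only its final iteration. The `sample` string and the variable names are assumptions made up for illustration.

```javascript
// Illustrative input: one sentence followed by two dash words.
const sample = "Sentence would go here\n-foo\n-bar\n";

// The group (\w+) sits inside a repeated non-capturing group.
const repeated = /^.*\n(?:-(\w+)\n)+/.exec(sample);

// The group matched twice ("foo", then "bar"), but only the last value is kept.
console.log(repeated[1]); // "bar"
```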
Answer 0: (score: 2)
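The first regular expression I came up with is the following:
/([^-]+)(-\w+)/g
The first group, ([^-]+), grabs everything that is not a dash. It is followed by the capture group we actually want, (-\w+). The g flag makes the regex object keep track of the last position it examined (its lastIndex). That means each call to regex.exec(search) returns the next match, just as you see them listed in regex101.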
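As a rough sketch of how the g flag drives repeated exec() calls, the loop below records lastIndex after each match. The `input` string and the names `dashWord` and `seen` are illustrative assumptions, not part of the answer's code.

```javascript
// Assumed sample input; variable names are illustrative only.
const input = "Sentence would go here\n-foo\n-bar\n";
const dashWord = /([^-]+)(-\w+)/g;

const seen = [];
let m;
while ((m = dashWord.exec(input)) !== null) {
  // With the g flag, lastIndex moves past each match,
  // so the next exec() call resumes from that position.
  seen.push({ before: m[1], word: m[2], lastIndex: dashWord.lastIndex });
}

console.log(seen.map((s) => s.word)); // [ '-foo', '-bar' ]
```

Once exec() returns null, lastIndex is reset to 0, so the same regex object can be reused on another string.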
Note: JavaScript's \w is equivalent to [a-zA-Z0-9_]. So if you only want letters, use [a-zA-Z] instead of \w.
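A small sketch of the difference, using a made-up dash word that contains an underscore and a digit:

```javascript
// Hypothetical input chosen to show where the two classes differ.
const line = "-foo_1";

const withWordClass = line.match(/-\w+/)[0];       // \w also accepts "_" and digits
const lettersOnly = line.match(/-[a-zA-Z]+/)[0];   // stops at the first non-letter

console.log(withWordClass); // "-foo_1"
console.log(lettersOnly);   // "-foo"
```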
Here is code that implements the first regular expression, /([^-]+)(-\w+)/g.
<p id = "input">
Sentence would go here
-foo
-bar
Another sentence would go here
-baz
-bat
</p>
<p id = "output">
</p>
<script>
// Returns true if the text before a dash contains a word character,
// i.e. it is a new sentence rather than just whitespace between dash words.
function check_for_word(search) {return /\w/.test(search);}
function capture(regex, search) {
var
// The initial match.
match = regex.exec(search),
// Stores all of the results from the search.
result = [],
// Used to gather results.
gather;
while(match) {
// Create something empty.
gather = [];
// Push onto the gather.
gather.push(match[2]);
// Get the next match.
match = regex.exec(search);
// While we have more dashes...
while(match && !check_for_word(match[1])) {
// Push result on!
gather.push(match[2]);
// Get the next match to be checked.
match = regex.exec(search);
};
// Push what was gathered onto the result.
result.push(gather);
}
// Hand back the result.
return result;
};
var output = capture(/([^-]+)(-\w+)/g, document.getElementById("input").innerHTML);
document.getElementById("output").innerHTML = JSON.stringify(output);
</script>
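With a slightly modified regular expression, you may get something closer to what you're looking for.
/[^-]+((?:-\w+[^-\w]*)+)/g
The extra [^-\w]* allows some kind of separator between the dash words. The non-capturing group (?:) is then added so that + can repeat one or more dash words. We also no longer need the () around [^-]+, because, as you will see below, that data is no longer needed. The first version is more flexible about what can break up the dash words, but I find this one cleaner.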
function capture(regex, search) {
var
// The initial match.
match = regex.exec(search),
// Stores all of the results from the search.
result = [],
// Used to gather results.
gather;
while(match) {
// Create something empty.
gather = [];
// Break up the large match.
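var temp = match[1].split('-');
// Use an index loop; for...in is meant for object keys, not arrays.
for(var i = 0; i < temp.length; i++)
{
// Strip everything that is not a word character (newlines, "!", etc.).
var word = temp[i].replace(/\W/g, "");
// Makes sure there was actually something to gather.
if(word.length > 0)
gather.push("-" + word);
}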
// Push what was gathered onto the result.
result.push(gather);
// Get the next match.
match = regex.exec(search);
};
// Hand back the result.
return result;
};
var output = capture(/[^-]+((?:-\w+[^-\w]*)+)/g, document.getElementById("input").innerHTML);
document.getElementById("output").innerHTML = JSON.stringify(output);
<p id = "input">
Sentence would go here
-foo
-bar
Another sentence would go here
-baz
-bat
My very own sentence!
-get
-all
-of
-these!
</p>
<p id = "output">
</p>
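The second approach can also be sketched without the DOM. This version assumes an environment with String.prototype.matchAll (ES2020), and the helper name `captureGroups` and the `sampleText` string are invented for illustration. Instead of splitting on "-", it runs a second global match over each captured run of dash words.

```javascript
// Hypothetical helper; not part of the original answer.
function captureGroups(text) {
  const blockPattern = /[^-]+((?:-\w+[^-\w]*)+)/g;
  const groups = [];
  for (const block of text.matchAll(blockPattern)) {
    // block[1] holds the whole run of dash words, e.g. "-foo\n-bar\n".
    // A global match with no capture groups returns every dash word in it.
    groups.push(block[1].match(/-\w+/g));
  }
  return groups;
}

const sampleText =
  "Sentence would go here\n-foo\n-bar\n" +
  "Another sentence would go here\n-baz\n-bat\n" +
  "My very own sentence!\n-get\n-all\n-of\n-these!\n";

console.log(JSON.stringify(captureGroups(sampleText)));
// [["-foo","-bar"],["-baz","-bat"],["-get","-all","-of","-these"]]
```

Like the code above, this keeps the leading dash on each word; changing the inner pattern to capture only the letters would drop it. A sentence that itself contains a hyphen would confuse [^-]+, so treat this as a sketch for input shaped like the example.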
Answer 1: (score: 1)
Regexp capturing doesn't work well for an unbounded number of groups. Splitting works better here:
var text = document.getElementById('text').textContent;
var blocks = text.split(/^(?!-)/m);
var result = blocks.map(function(block) {
return block.split(/^-/m).slice(1).map(function(line) {
return line.trim();
});
});
document.getElementById('text').textContent = JSON.stringify(result);
<div id="text">Sentence would go here
-foo
-bar
Another sentence would go here
-baz
-bat
</div>
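The two-stage split can be sketched on a plain string as follows. The helper name `splitIntoGroups`, the `splitSample` string, and the final filter step are assumptions added for illustration; the two split patterns are the ones used above.

```javascript
// Hypothetical helper; the name and sample text are illustrative assumptions.
function splitIntoGroups(text) {
  return text
    // Stage 1: cut before every line that does not start with a dash,
    // so each block is one sentence plus the dash lines that follow it.
    .split(/^(?!-)/m)
    // Stage 2: inside a block, cut at each line-leading dash,
    // drop the sentence (first piece), and trim the newlines.
    .map((block) => block.split(/^-/m).slice(1).map((line) => line.trim()))
    // Added assumption: discard sentences that have no dash lines.
    .filter((group) => group.length > 0);
}

const splitSample =
  "Sentence would go here\n-foo\n-bar\n" +
  "Another sentence would go here\n-baz\n-bat\n";

console.log(JSON.stringify(splitIntoGroups(splitSample)));
// [["foo","bar"],["baz","bat"]]
```

The lookahead (?!-) makes the first split zero-width, so no characters are consumed and each block keeps its sentence line intact.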
Answer 2: (score: 1)
If it is sufficient for your case, simply match two lines that each start with -, preceded by a newline.
\n-(.*)\r?\n-(.*)
See the regex demo at regex101. To get the matches, use the exec() method.
var re = /\n-(.*)\r?\n-(.*)/g; var m;
var str = 'Sentence would go here\n-foo\n-bar\nAnother sentence would go here\n-baz\n-bat';
while ((m = re.exec(str)) !== null) {
if (m.index === re.lastIndex) re.lastIndex++;
document.write(m[1] + ',' + m[2] + '<br>');
}
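As a variation on the loop above, the matches can be collected into the nested array shape the question asks for instead of being written to the page. The names `pairPattern`, `pairInput`, and `pairs` are illustrative assumptions.

```javascript
// Same pattern as above; it expects exactly two dash lines per sentence.
const pairPattern = /\n-(.*)\r?\n-(.*)/g;
const pairInput =
  "Sentence would go here\n-foo\n-bar\nAnother sentence would go here\n-baz\n-bat";

const pairs = [];
let pair;
while ((pair = pairPattern.exec(pairInput)) !== null) {
  // pair[1] and pair[2] are the two captured dash words.
  pairs.push([pair[1], pair[2]]);
}

console.log(JSON.stringify(pairs)); // [["foo","bar"],["baz","bat"]]
```

The lastIndex guard is left out here because this pattern always consumes at least the two newline-dash pairs, so it cannot produce an empty match. A sentence followed by one dash line, or by three, would not be grouped correctly with this pattern.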