Regex to match repeated [01] using capture groups

Time: 2019-05-11 19:43:23

Tags: regex regex-lookarounds regex-group regex-greedy

I have a variable-length string of values (actual bits: 1s and 0s, with a length that is a multiple of 32). For example:

010011011001110111100111011010001001100011101100100011100010100011110010100011001111111101101001

Each 32-bit block has an internal structure: the first 8 bits belong together, and so do the last 24 bits.

I would like to

  • get each 32-bit block, and then
  • get the internal structure of each block

in a single regular expression.

My approach

^(([01]{8})([01]{24})){0,}$

does not solve this, because it only matches the last block.
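The behavior described here can be reproduced with a minimal JavaScript sketch (the variable names are illustrative and not part of the original post). The pattern does match the whole string, but a capture group that sits inside a quantifier keeps only the text of its last repetition:

```javascript
// Sample input from the question: three 32-bit blocks.
const bits = '010011011001110111100111011010001001100011101100100011100010100011110010100011001111111101101001';

// The anchored pattern from the question.
const anchored = /^(([01]{8})([01]{24})){0,}$/;
const m = bits.match(anchored);

console.log(m[0] === bits); // true: the overall match spans all three blocks
console.log(m[1]);          // but group 1 holds only the last 32-bit block
console.log(m[2]);          // first 8 bits of that last block
console.log(m[3]);          // remaining 24 bits of that last block
```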

Is such a regular expression possible? What should I look for? What am I doing wrong?

2 Answers:

Answer 0: (score: 2)

I have slightly modified it using this tool:

(([0-1]{8})([0-1]{24}))

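If I understand correctly, you probably do not want to bind it to the start and end anchors. You can simply wrap it in another capture group and, together with the two capture groups you already have, extract the data as needed.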
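One way to read this suggestion, as a minimal sketch: with the anchors removed and the `g` flag set, each 32-bit block becomes its own match with its own set of groups. The helper name `splitBlocks` and the property names `block`, `head`, and `tail` are hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: returns one entry per 32-bit block found in the input.
function splitBlocks(input) {
  const blockPattern = /(([01]{8})([01]{24}))/g;
  // matchAll yields one match object per non-overlapping block
  return Array.from(input.matchAll(blockPattern), (m) => ({
    block: m[1], // the whole 32-bit block
    head: m[2],  // first 8 bits
    tail: m[3],  // remaining 24 bits
  }));
}

const bits = '010011011001110111100111011010001001100011101100100011100010100011110010100011001111111101101001';
const blocks = splitBlocks(bits);

console.log(blocks.length); // number of complete blocks found
console.log(blocks[0]);     // parts of the first block
```

Note that without the anchors this pattern skips over anything that is not a complete block, so it does not verify that the input consists only of whole 32-bit blocks.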

RegEx Circuit

This link helps you visualize the expression:

JavaScript Demo

const regex = /(([0-1]{8})([0-1]{24}))/gm;
const str = `010011011001110111100111011010001001100011101100100011100010100011110010100011001111111101101001
`;
const subst = `Group #1: $1\nGroup #2: $2\nGroup #3: $3\n`;

// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);

console.log('Substitution result: ', result);

Performance Test

This snippet reports the runtime of a for loop that runs one million times.

const repeat = 1000000;
const start = Date.now();

for (let i = 0; i < repeat; i++) { // exactly `repeat` iterations
	const regex = /(([0-1]{8})([0-1]{24}))/gm;
	const str = `010011011001110111100111011010001001100011101100100011100010100011110010100011001111111101101001`;
	const subst = `\nGroup #1: $1\nGroup #2: $2\nGroup #3: $3`;

	var match = str.replace(regex, subst);
}

const end = Date.now() - start;
console.log("YAAAY! \"" + match + "\" is a match  ");
console.log(end / 1000 + " is the runtime of " + repeat + " times benchmark test.  ");

Answer 1: (score: 2)

In Java, you can match one block at a time.

Code

// Requires java.util.regex.Pattern and java.util.regex.Matcher
// \G matches only exactly where the previous `find()` left off
// (?:^|\G) matches either at start of line or where previous `find()` left off
Pattern p = Pattern.compile("(?:^|\\G)([01]{8})([01]{24})");
// inputString should not contain e.g. newline characters
Matcher m = p.matcher(inputString);
int lastMatchEnd = 0;
while (m.find()) {
    String firstPart = m.group(1);
    String secondPart = m.group(2);
    // ...
    // remember how far we got
    lastMatchEnd = m.end();
}
if (lastMatchEnd != inputString.length()) {
    // if we get here, there was garbage in the line that did not match
}
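For comparison with the JavaScript demo in the other answer: JavaScript regular expressions have no `\G`, but the sticky flag `y` gives a comparable effect, because each `exec()` call must match exactly at `lastIndex`. The following is only a sketch of that idea; `parseBlocks` and its return shape are hypothetical, not part of either answer.

```javascript
// Hypothetical helper mirroring the Java loop above with the sticky flag.
function parseBlocks(input) {
  // With `y`, a match must start exactly at pattern.lastIndex
  const pattern = /([01]{8})([01]{24})/y;
  const parts = [];
  let lastMatchEnd = 0;
  let m;
  while ((m = pattern.exec(input)) !== null) {
    parts.push({ head: m[1], tail: m[2] });
    // remember how far we got; a failed exec() resets lastIndex to 0
    lastMatchEnd = pattern.lastIndex;
  }
  // complete is false if anything was left over that did not match
  return { parts, complete: lastMatchEnd === input.length };
}

const bits = '010011011001110111100111011010001001100011101100100011100010100011110010100011001111111101101001';
console.log(parseBlocks(bits).complete);       // true: three whole blocks
console.log(parseBlocks(bits + 'x').complete); // false: trailing garbage
```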