Question

I want to get all substrings that match this expression: 1[0]+1

std::string str =  "0011011000001";
std::regex rx   ("1[0]+1");
std::smatch res;
std::regex_search(str, res, rx);
for (size_t i = 0; i < res.size(); i++)
    std::cout << res[i] << std::endl;

But it only returns the first substring. What am I doing wrong?

Answer 1

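std::regex_search stops at the first match, so you need to call it in a loop, continuing from the end of each match, to get all the substrings: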

while (std::regex_search (str,res,rx)) {
    std::cout <<res[0] << std::endl;
    str = res.suffix().str();
}

Or you can use std::regex_iterator to get all the substrings, like this:

std::regex_iterator<std::string::iterator> rit ( str.begin(), str.end(), rx);
std::regex_iterator<std::string::iterator> rend;

while (rit != rend) {
    std::cout << rit->str() << std::endl;
    ++rit;
}
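For reference, the iterator approach can be wrapped in a small function that returns the matches instead of printing them. This is only a minimal sketch: the name `find_matches` is made up for illustration and is not part of the standard library.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): collects every
// non-overlapping match of rx in text, scanning left to right.
std::vector<std::string> find_matches(const std::string& text, const std::regex& rx) {
    std::vector<std::string> out;
    // std::sregex_iterator is std::regex_iterator<std::string::const_iterator>;
    // a default-constructed iterator acts as the end-of-sequence marker.
    for (std::sregex_iterator it(text.begin(), text.end(), rx), end; it != end; ++it) {
        out.push_back(it->str());  // text of the whole match
    }
    return out;
}
```

Calling `find_matches(str, rx)` with the string and regex from the question should give "101" and "1000001".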

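But for the string "00110101000001" both approaches still output only '101' and '1000001', because the first match consumes part of the string: the '1' that ends the first '101' cannot also start the second one. If you want to find all overlapping matches, you need a regex implementation that supports lookaround assertions. Python's re module does: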

>>> re.findall(r'(?=(1[0]+1))', '00110101000001')
['101', '101', '1000001']

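(?=...) matches if ... matches next, but does not consume any of the string. This is called a lookahead assertion. For example, Isaac (?=Asimov) matches 'Isaac ' only if it is followed by 'Asimov'.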
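The same trick should also work in C++, since the default ECMAScript grammar of std::regex accepts the positive lookahead (?=...). The sketch below wraps the pattern in a lookahead with a capture group and reads group 1 from each (empty) match. The helper name `find_overlapping_matches` is made up for illustration, and the sketch assumes the standard library keeps captures made inside a lookahead, as the ECMAScript rules specify.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): returns all matches of
// pattern in text, including overlapping ones.
std::vector<std::string> find_overlapping_matches(const std::string& text,
                                                  const std::string& pattern) {
    // The lookahead itself matches the empty string, so nothing is consumed;
    // the text it looked at is captured in group 1.
    const std::regex rx("(?=(" + pattern + "))");
    std::vector<std::string> out;
    // After an empty match, std::sregex_iterator moves on by one character,
    // so every start position in text gets tried.
    for (std::sregex_iterator it(text.begin(), text.end(), rx), end; it != end; ++it) {
        out.push_back((*it)[1].str());
    }
    return out;
}
```

With the pattern "1[0]+1" and the string "00110101000001" this should return the same three substrings as the Python call above.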

Answer 2

Make the match non-greedy...

std::regex rx   ("(1[0]+1)?");

Unable to get all substrings from a string using RegEx in C++
