Regex - get multiple captures at multiple lines and their positions

Date: 2016-04-07 10:41:55

Tags: c++ regex

I've been trying to do this for several days, but I still can't get the right result.

I need to get all captures from a whole string (including newlines), without stopping after the first match. I have this regex: (\d+)\s*;. The result should contain only the numbers (not the spaces or the semicolon after them), and they should be taken from the whole input string (which is multiline).

So, for example, from this input:

5;
6 ;
 7;
 8 ;

It should return "5", "6", "7" and "8".

Does anyone know how to do that in C++?

Thanks!


EDIT:

From links mentioned in comments, I came up with this code:

std::regex regex(regex_str, std::regex_constants::icase);

for (std::sregex_iterator it = std::sregex_iterator(input.begin(), input.end(), regex); it != std::sregex_iterator(); ++it)
{
    std::smatch m = *it;

    std::cout << "\t" << m[1].str() << std::endl;
}

It works just fine, but there is one thing I forgot to mention: I also need to get the position of every capture. Using m.position() I can get the position of the whole match, but if there's something before the capturing group I'm looking for, that gives me the wrong position. So, how can I get the position of the capture itself?

1 answer:

Answer 0 (score: 0)

You can use regex_token_iterator to achieve this:

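#include <iostream>
#include <regex>
#include <string>

// Portable entry point; the original _tmain/_TCHAR signature is MSVC-specific
// and needs <tchar.h>.
int main()
{
    std::string strExample = "5\n 6\n\t7 ; 8; 10";

    // Matches every run of digits, whether or not a semicolon follows it.
    std::regex re("\\d+");

    // With no submatch argument, each token is the whole match.
    for (std::sregex_token_iterator it(strExample.cbegin(), strExample.cend(), re), end; it != end; ++it)
    {
        std::cout << *it << '\n';
    }

    return 0;
}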
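
The code above does not cover the position part of the question. One possible approach is to stay with std::sregex_iterator from the question's edit and call position(1) instead of position(): std::match_results::position takes an optional submatch index, so it can report the offset of a capture group rather than of the whole match. A minimal sketch follows; find_captures is a hypothetical helper name used only for illustration, and it assumes group 1 is the capture of interest.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the original post): collects the text of
// capture group 1 for every match, paired with that group's offset from the
// start of `input`.
std::vector<std::pair<std::string, std::size_t>>
find_captures(const std::string& input, const std::regex& re)
{
    std::vector<std::pair<std::string, std::size_t>> result;

    for (std::sregex_iterator it(input.begin(), input.end(), re), end; it != end; ++it)
    {
        const std::smatch& m = *it;

        // Skip matches in which group 1 did not participate.
        if (!m[1].matched)
            continue;

        // position(1) is the offset of capture group 1;
        // position() with no argument is the offset of the whole match.
        result.emplace_back(m[1].str(), static_cast<std::size_t>(m.position(1)));
    }

    return result;
}
```

Called with the question's input "5;\n6 ;\n 7;\n 8 ;" and std::regex(R"((\d+)\s*;)"), this should yield "5", "6", "7" and "8" at offsets 0, 3, 8 and 12. With a pattern that has text before the group, such as x=(\d+) on the input "x=42", position(1) reports 2 while position() would report 0.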