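I am trying to use regular expressions in C++ to extract the lines that match a certain word, but only from within regions of a file delimited by two other patterns. I would also like to print the line number of each match.
I am currently running a perl command through popen, but I would like to do this in C++ instead: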
perl -ne 'if ((/START/ .. /END/) && /test/) {print "line$.:$_"}' file
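This command finds the regions between START and END, and then, from the lines in those regions, extracts the ones that contain the word test.
How can I do this using regular expressions in C++?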
Answer 0 (score: 3)
The semantics of Perl's .. operator are subtle. The code below emulates both .. and the while (<>) { ... } loop implied by the -n switch to perl.
#include <fstream>
#include <iostream>
#include <regex>
#include <string>
#include <vector>

// Emulate Perl's .. operator: return true if str lies inside a
// start..end range, updating the range state kept in inside.
bool flipflop(bool& inside, const std::regex& start, const std::regex& end, const std::string& str)
{
  // outside the range and this line does not open one
  if (!inside && !std::regex_match(str, start))
    return false;

  // Perl's .. is still true on the line that matches end and only
  // becomes false afterwards, so close the range for the NEXT line
  inside = !std::regex_match(str, end);
  return true;
}

int main(int argc, char *argv[])
{
  // extra .* wrappers to use regex_match in order to work around
  // problems with regex_search in GNU libstdc++
  std::regex start(".*START.*"), end(".*END.*"), match(".*test.*");

  for (const auto& path : std::vector<std::string>(argv + 1, argv + argc)) {
    std::ifstream in(path);
    if (!in) {
      std::cerr << "cannot open " << path << '\n';
      continue;
    }

    std::string str;
    bool inside = false;  // range state starts closed for each file
    int line = 0;         // line numbers restart for each file

    while (std::getline(in, str)) {
      ++line;
      // evaluate the flip-flop on every line, before testing the word
      const bool in_range = flipflop(inside, start, end, str);
      if (in_range && std::regex_match(str, match))
        std::cout << path << ':' << line << ": " << str << '\n';
    }
  }

  return 0;
}
For example, consider the following input.
test ERROR 1
START
test
END
test ERROR 2
START
foo ERROR 3
bar ERROR 4
test 1
baz ERROR 5
END
test ERROR 6
START sldkfjsdflkjsdflk
test 2
END
lksdjfdslkfj
START
dslfkjs
sdflksj
test 3
END dslkfjdsf
Sample runs:
$ ./extract.exe file
file:3: test
file:9: test 1
file:14: test 2
file:20: test 3
$ ./extract.exe file file
file:3: test
file:9: test 1
file:14: test 2
file:20: test 3
file:3: test
file:9: test 1
file:14: test 2
file:20: test 3