最小代码示例如下:
#include <cstdlib>
#include <iostream>
#include <vector>
#include <regex.h>
using namespace std;
class regex_result {
public:
/** Contains indices of starting positions of matches.*/
std::vector<int> positions;
/** Contains lengths of matches.*/
std::vector<int> lengths;
};
regex_result match_regex(string regex_string, const char* string) {
regex_result result;
regex_t* regex = new regex_t;
regcomp(regex, regex_string.c_str(), REG_EXTENDED);
/* "P" is a pointer into the string which points to the end of the
previous match. */
const char* pointer = string;
/* "n_matches" is the maximum number of matches allowed. */
const int n_matches = 10;
regmatch_t matches[n_matches];
int nomatch = 0;
while (!nomatch) {
nomatch = regexec(regex, pointer, n_matches, matches, 0);
if (nomatch)
break;
for (int i = 0; i < n_matches; i++) {
int start,
finish;
if (matches[i].rm_so == -1) {
break;
}
start = matches[i].rm_so + (pointer - string);
finish = matches[i].rm_eo + (pointer - string);
result.positions.push_back(start);
result.lengths.push_back(finish - start);
}
pointer += matches[0].rm_eo;
}
delete regex;
return result;
}
int main(int argc, char** argv) {
string str = "this is a test";
string pat = "this";
regex_result res = match_regex(pat, str.c_str());
cout << res.positions.size() << endl;
return 0;
}
所以我编写了一个函数来解析给定字符串的正则表达式匹配。结果保存在一个基本上是两个向量的类中,一个用于匹配的位置,另一个用于相应的匹配长度。
这样可以正常工作,但是当我在它上面运行valgrind
时,它会显示一些实质性的内存泄漏。
在上面的代码中使用valgrind --leak-check=full
时,我得到:
==24843== Memcheck, a memory error detector
==24843== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==24843== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==24843== Command: ./test
==24843==
1
==24843==
==24843== HEAP SUMMARY:
==24843== in use at exit: 11,688 bytes in 37 blocks
==24843== total heap usage: 54 allocs, 17 frees, 12,868 bytes allocated
==24843==
==24843== 256 bytes in 1 blocks are definitely lost in loss record 14 of 18
==24843== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24843== by 0x543549A: regcomp (regcomp.c:487)
==24843== by 0x400ED0: match_regex(std::string, char const*) (in <path>)
==24843== by 0x4010CA: main (in <path>)
==24843==
==24843== 11,432 (224 direct, 11,208 indirect) bytes in 1 blocks are definitely lost in loss record 18 of 18
==24843== at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24843== by 0x4C2CF1F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==24843== by 0x5434BAF: re_compile_internal (regcomp.c:760)
==24843== by 0x54354FF: regcomp (regcomp.c:506)
==24843== by 0x400ED0: match_regex(std::string, char const*) (in <path>)
==24843== by 0x4010CA: main (in <path>)
==24843==
==24843== LEAK SUMMARY:
==24843== definitely lost: 480 bytes in 2 blocks
==24843== indirectly lost: 11,208 bytes in 35 blocks
==24843== possibly lost: 0 bytes in 0 blocks
==24843== still reachable: 0 bytes in 0 blocks
==24843== suppressed: 0 bytes in 0 blocks
==24843==
==24843== For counts of detected and suppressed errors, rerun with: -v
==24843== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
我的代码是错误的还是这些文件中确实存在错误?
答案 0 :(得分:5)
您的regex_t
管理层不需要是动态的,虽然这与您的问题没有直接关系,但这有点奇怪。真正的问题是,如果编译成功,你永远不会regfree()
你的结果表达 (你应该验证)。您应该像这样设置正则表达式:
regex_t regex;
int res = regcomp(®ex, regex_string.c_str(), REG_EXTENDED);
if (res == 0)
{
// use your expression via ®ex
....
// and eventually free it when done.
regfree(®ex);
}
如果你的实现支持它们,我强烈建议使用C ++ 11提供的<regex>
库,因为它有很好的RAII解决方案。
答案 1 :(得分:3)
您必须致电regfree()以释放由regcomp()
分配的内存。