I can use C++ std::regex to extract a four-line string with this snippet:
std::regex table("(<table id.*\n.*\n.*\n.*>)");
const std::string format = "$&";
std::cout << std::regex_replace(tidy_string(/* */), table, format,
                                std::regex_constants::format_no_copy |
                                std::regex_constants::format_first_only)
          << '\n';
tidy_string() returns a std::string, and the code produces this output:
<table id="creditPolicyTable" class=
"table table-striped table-condensed datatable top-bold-border bottom-border"
summary=
"This table of Credit Policy gives credit information (column headings) for list of exams (row headings).">
How can I match text that spans a different number of lines instead of exactly four? For example:
<table id="creditPolicyTable" summary=
"This table of Credit Policy gives credit information (column headings) for list of exams (row headings).">
Or:
<table id="creditPolicyTable"
class="table table-striped table-condensed datatable top-bold-border bottom-border"
summary="This table of Credit Policy gives credit information (column headings) for list of exams (row headings)."
more="x"
even_more="y">
Answer 0 (score: 0)
You should use std::regex_search and lazily match any character other than ">". Like this:
#include <iostream>
#include <regex>
#include <string>

int main() {
    // Two sample tags that span different numbers of lines.
    const std::string samples[] = {
        "<table id=\"creditPolicyTable\" class=\n"
        "\"table table-striped table-condensed datatable top-bold-border bottom-border\"\n"
        "summary=\n"
        "\"This table of Credit Policy gives credit information (column headings) for list of exams (row headings).\">",
        "<table id=\"creditPolicyTable\" summary=\n"
        "\"This table of Credit Policy gives credit information (column headings) for list of exams (row headings).\"\n"
        "more=\"x\"\n"
        "even_more=\"y\">"};

    std::smatch table_match;
    // [^>] matches any character except '>', including newlines, so the tag
    // may span any number of lines. The lazy +? is harmless here because the
    // class already stops at the first '>'.
    const std::regex table_regex("<table\\sid=[^>]+?>");

    for (const auto& sample : samples) {
        if (std::regex_search(sample, table_match, table_regex)) {
            std::cout << "Match found: " << table_match.str() << '\n';
        }
    }
}
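The same idea can be wrapped in a small helper so it is applied directly to the string returned by tidy_string(). This is only a minimal sketch: the function name extract_table_tag is hypothetical and does not come from the question or the answer. It assumes the default ECMAScript grammar, in which a negated class such as [^>] also matches '\n', and it assumes no attribute value contains a literal ">" character.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original posts): returns the first
// "<table id=...>" opening tag found in html, or an empty string if the
// input contains no such tag.
std::string extract_table_tag(const std::string& html) {
    // [^>]* consumes everything up to the first '>', newlines included,
    // so the tag may be spread over any number of lines.
    static const std::regex table_regex(R"(<table\s+id=[^>]*>)");
    std::smatch match;
    if (std::regex_search(html, match, table_regex)) {
        return match.str();
    }
    return std::string();
}
```

Calling extract_table_tag(tidy_string(/* */)) should then yield the whole opening tag whether it occupies two, four, or five lines. If an attribute value can contain ">", a regular expression of this kind is no longer reliable and an HTML parser would be the safer choice.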