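I am working on correcting validation issues on several pages at work. When these pages were last audited, the correct way to code a table was to put <tfoot> between </thead> and <tbody>. Since then, the powers that be who decide validation have decided that <tfoot> must come after </tbody>. Some pages have dozens of tables that need to be changed.
I would like to know whether there is a way to capture everything between <tfoot> and </tfoot> and move it after </tbody>. I have a basic understanding of regular expressions, but I can't figure out how to make one find the footer content, and the footer content is different in every table.
Examples:
<tfoot>
<tr>
<td class="small text-left" colspan="6">Source: Mintel <abbr title="Global New Products Database">GNPD</abbr>, 2015.<br>
Note: rankings are based on 2014 data and <abbr title="Global New Products Database">GNPD</abbr> search was based solely
on products that contained a form of the word "flax."</td>
</tr>
</tfoot>
and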
<tfoot>
<tr>
<td class="small text-left" colspan="5">Source: Global Trade Atlas,
2015 <br>
Compound Annual Growth Rate (CAGR) </td>
</tr>
</tfoot>
Basically, turn something like this:
<table class="table table-bordered text-right table-condensed mrgn-tp-lg">
<caption>Top 5 Pet Food Companies Worldwide in 2014, US$</caption>
<thead>
<tr>
<th scope="col" class="active text-center">Company</th>
<th scope="col" class="active text-center">International Sales</th>
<th scope="col" class="active text-center">Sales in the EU</th>
</tr>
</thead>
<tfoot>
<tr>
<td class="small text-left" colspan="3">Source: Euromonitor International, 2015</td>
</tr>
</tfoot>
<tbody>
<tr>
<td><b>1. Mars <abbr title="Incorporated">Inc.</abbr></b></td>
<td>$17.8 billion</td>
<td>$5.7 billion</td>
</tr>
<tr>
<td><b>2. Nestlé <abbr lang="fr" xml:lang="fr" title="Société Anonym">SA</abbr></b></td>
<td>$16.8 billion</td>
<td>$4.1 billion</td>
</tr>
<tr>
<td><b>3. Colgate-Palmolive <abbr title="Company">Co</abbr></b></td>
<td>$3.7 billion</td>
<td>$0.7 billion</td>
</tr>
<tr>
<td><b>4. Big Heart Pet Brand</b></td>
<td>$2.9 billion</td>
<td>Not available (N/A)</td>
</tr>
<tr>
<td><b>5. Blue Buffalo <abbr title="Company">Co</abbr> <abbr title="Limited">Ltd</abbr></b></td>
<td>$1.4 billion</td>
<td>N/A</td>
</tr>
</tbody>
</table>
into this:
<table class="table table-bordered text-right table-condensed mrgn-tp-lg">
<caption>Top 5 Pet Food Companies Worldwide in 2014, US$</caption>
<thead>
<tr>
<th scope="col" class="active text-center">Company</th>
<th scope="col" class="active text-center">International Sales</th>
<th scope="col" class="active text-center">Sales in the EU</th>
</tr>
</thead>
<tbody>
<tr>
<td><b>1. Mars <abbr title="Incorporated">Inc.</abbr></b></td>
<td>$17.8 billion</td>
<td>$5.7 billion</td>
</tr>
<tr>
<td><b>2. Nestlé <abbr lang="fr" xml:lang="fr" title="Société Anonym">SA</abbr></b></td>
<td>$16.8 billion</td>
<td>$4.1 billion</td>
</tr>
<tr>
<td><b>3. Colgate-Palmolive <abbr title="Company">Co</abbr></b></td>
<td>$3.7 billion</td>
<td>$0.7 billion</td>
</tr>
<tr>
<td><b>4. Big Heart Pet Brand</b></td>
<td>$2.9 billion</td>
<td>Not available (N/A)</td>
</tr>
<tr>
<td><b>5. Blue Buffalo <abbr title="Company">Co</abbr> <abbr title="Limited">Ltd</abbr></b></td>
<td>$1.4 billion</td>
<td>N/A</td>
</tr>
</tbody>
<tfoot>
<tr>
<td class="small text-left" colspan="3">Source: Euromonitor International, 2015</td>
</tr>
</tfoot>
</table>
Answer 0 (score: 0)
You can use
(</thead>\s*)([\s\S]*?)\s*(<tbody>[\s\S]*?</tbody>)
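See the regex demo. Replace with $1$3\n$2.
Details
(</thead>\s*) - Group 1: a </thead> substring followed by 0+ whitespace characters (\s*)
([\s\S]*?) - Group 2: any 0+ characters, as few as possible, up to the leftmost occurrence of the subsequent subpatterns
\s* - 0+ whitespace characters
(<tbody>[\s\S]*?</tbody>) - Group 3:
    <tbody> - a <tbody> substring
    [\s\S]*? - any 0+ characters, as few as possible, up to the leftmost occurrence of the next subpattern
    </tbody> - a </tbody> substring.
The $1$3\n$2 replacement replaces the match with the Group 1 value, then the Group 3 value, then a newline, then the Group 2 value. In other words, whatever sat between </thead> and <tbody> (the <tfoot> block) ends up right after </tbody>.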
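If you would rather script this than run it in an editor, here is a minimal sketch in Python, assuming the tables all follow the thead / tfoot / tbody layout from the question. The function name move_tfoot_after_tbody and the shortened sample table are made up for illustration. Note that Python's re module writes backreferences in the replacement as \g<1> rather than $1.

```python
import re

# Same pattern as in the answer: Group 1 is </thead> plus trailing whitespace,
# Group 2 is whatever sits before <tbody>, Group 3 is the whole <tbody> block.
PATTERN = re.compile(r"(</thead>\s*)([\s\S]*?)\s*(<tbody>[\s\S]*?</tbody>)")


def move_tfoot_after_tbody(html: str) -> str:
    """Move the content between </thead> and <tbody> to just after </tbody>."""
    # \g<1>\g<3>\n\g<2> is the Python spelling of $1$3\n$2.
    return PATTERN.sub(r"\g<1>\g<3>\n\g<2>", html)


# Hypothetical, shortened version of the table from the question.
sample = (
    "<table>\n"
    "<thead>\n<tr><th>Company</th></tr>\n</thead>\n"
    "<tfoot>\n<tr><td>Source: Euromonitor International, 2015</td></tr>\n</tfoot>\n"
    "<tbody>\n<tr><td>Mars Inc.</td></tr>\n</tbody>\n"
    "</table>\n"
)

result = move_tfoot_after_tbody(sample)
print(result)
```

Since the pattern is applied with sub, every table in the file that matches is rewritten in one pass. It is worth diffing the output before committing, because a table without a <tbody> could make the lazy match run on into the next table.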