我正在尝试在字符串中的所有<tr> </tr>
标记之间提取文本并打印它们。
use strict;
use warnings;
my $HTML = '
<tr data_1="15,12,2016" data_2="1">
<td class="cl_1">11111</td>
<td class="cl_2">11111</td>
<td class="cl_3"><strong>11111</strong></td>
<td class="cl_4" colspan="3">11111</td>
</tr>
<tr data_1="16,12,2016" data_2="0">
<td class="cl_1">22222</td>
<td class="cl_2">22222</td>
<td class="cl_3"><strong>22222</strong></td>
<td class="cl_4" colspan="3">22222</td>
</tr>
<tr data_1="15,12,2016" data_2="1">
<td class="cl_1">33333</td>
<td class="cl_2">33333</td>
<td class="cl_3"><strong>33333</strong></td>
<td class="cl_4" colspan="3">33333</td>
</tr>
';
while($HTML =~ /data_2="1">(.*)<\/tr>(\R)/sg) {
print "$1\n\n";
}
输出应为:
<td class="cl_1">11111</td>
<td class="cl_2">11111</td>
<td class="cl_3"><strong>11111</strong></td>
<td class="cl_4" colspan="3">11111</td>
<td class="cl_1">33333</td>
<td class="cl_2">33333</td>
<td class="cl_3"><strong>33333</strong></td>
<td class="cl_4" colspan="3">33333</td>
如何执行此操作并从每个<tr>
标记中提取内容?
答案 0 :(得分:0)
编辑答案以包含需要<tr>'s
的新限制
my $HTML = '
<tr data_1="15,12,2016" data_2="1">
<td class="cl_1">11111</td>
<td class="cl_2">11111</td>
<td class="cl_3"><strong>11111</strong></td>
<td class="cl_4" colspan="3">11111</td>
</tr>
<tr data_1="16,12,2016" data_2="0">
<td class="cl_1">22222</td>
<td class="cl_2">22222</td>
<td class="cl_3"><strong>22222</strong></td>
<td class="cl_4" colspan="3">22222</td>
</tr>
<tr data_1="15,12,2016" data_2="1">
<td class="cl_1">33333</td>
<td class="cl_2">33333</td>
<td class="cl_3"><strong>33333</strong></td>
<td class="cl_4" colspan="3">33333</td>
</tr>
';
while($HTML =~ /<tr[^>]*data_2=\"1\"[^>]*>(.*?)<\/tr>/msg) {
print "$1\n\n"; }
输出:
<td class="cl_1">11111</td>
<td class="cl_2">11111</td>
<td class="cl_3"><strong>11111</strong></td>
<td class="cl_4" colspan="3">11111</td>
<td class="cl_1">33333</td>
<td class="cl_2">33333</td>
<td class="cl_3"><strong>33333</strong></td>
<td class="cl_4" colspan="3">33333</td>