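Is there a possible solution to this problem?
I want a regular expression that ignores all td tags inside a tr tag. The tr tag I am looking for is malformed because its closing tag is missing the "/". So far I have: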
<tr[^>]*><td(?:(?!</td>).)*</td><tr[^>]*>
<tr[^>]*> This needs to be the beginning of the expression ****
<td(?:(?!</td>).)*</td> This will find everything between <td> and </td>
<tr[^>]*> This needs to be the end of the expression ****
This regex does not work, of course. Below are sample texts to run the regex against:
Sample 1:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>
</title>
</head>
<body>
<table asdf>
<tr asdf>
<td asdf>
<table asdf>
<tr asdf: asdf>
<td>
blah blah blah
</td>
</tr>
</table>
</td>
<td>
Keep going
</td>
<tr> If highlighted to here from first tr tag then correct regex was used
</table>
</body>
</html>
Sample 2:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>
</title>
</head>
<body>
<table asdf>
<tr asdf>
<td asdf>
<table asdf>
<tr asdf: asdf>
<td>
blah blah blah
</td>
</tr>
</table>
</td>
<td>
<table asdf>
<tr asdf: asdf>
<td>
blah blah blah
</td>
</tr>
</table>
</td>
<tr> If highlighted to here from first tr tag then correct regex was used
</table>
</body>
</html>
Sample 3:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>
</title>
</head>
<body>
<table asdf>
<tr asdf>
<td asdf>
<table asdf>
<tr asdf: asdf>
<td>
blah blah blah
</td>
</tr>
</table>
</td>
<td>
<table>
<tr>
<td>
blah blah blah
</td>
</tr>
</table>
</td>
<tr> If highlighted to here from first tr tag then correct regex was used
</table>
</body>
</html>
Sample 4:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>
</title>
</head>
<body>
<table>
<tr>
<td> </td>
</tr>
</table>
<br/>
<br/>
<br/>
<table class="afdadsf">
<td></td>
</table>
<br/>
<br/>
<table class="fdafdas">
<tr><td></td>
</tr>
</table>
</body>
</html>
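The output I want is that, when the regex is executed against the first two sample texts above, everything from the first tr tag up to the last tr tag is highlighted. Assume that in other sample texts the td tags may contain any value.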
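To make the desired behaviour concrete, here is a rough sketch of it in Python. It is not a single regex: it uses a small regex only to find table and tr tags, then tracks nesting with a stack, which may be what the nested tables require. The function name `find_unclosed_tr_spans` is made up for illustration, and the sketch assumes tags are written without comments or `>` characters inside attribute values.

```python
import re

# Matches opening/closing <table> and <tr> tags; group 1 is "/" for a closing tag.
TAG = re.compile(r"<(/?)(table|tr)\b[^>]*>", re.IGNORECASE)


def find_unclosed_tr_spans(html):
    """Return (start, end) offsets that run from a <tr ...> tag to the
    stray <tr> that should have been </tr> (hypothetical helper)."""
    spans = []
    # One slot per table nesting level: offset of the currently open <tr>, or None.
    stack = [None]
    for m in TAG.finditer(html):
        closing = m.group(1) == "/"
        name = m.group(2).lower()
        if name == "table":
            if not closing:
                stack.append(None)      # entering a nested table
            elif len(stack) > 1:
                stack.pop()             # leaving a table
        elif closing:
            stack[-1] = None            # a proper </tr> closes the open row
        elif stack[-1] is None:
            stack[-1] = m.start()       # a row opens at this nesting level
        else:
            # A second <tr> while a row is still open at the same level:
            # treat it as the malformed closing tag.
            spans.append((stack[-1], m.end()))
            stack[-1] = None
    return spans
```

Slicing the input with a returned pair, as in `html[start:end]`, gives the text that should be highlighted. For Samples 1 to 3 that would be the outer row up to and including the stray `<tr>`, and for Sample 4 the list should be empty because every row is closed properly.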