我将一个字符串(HTML)发布到服务器端,然后使用HTMLAgility包验证它。在HTML中有一个未闭合的colgroup标记。
清理后,关闭的colgroup标签出现在关闭“tbody”和“table”标签之间
在:
<table width="3265" class="mce-item-table" style="width: 2452pt; border-collapse: collapse;" border="0" cellspacing="0" cellpadding="0">
<colgroup><col width="80" style="width: 60pt;">
<col width="245" style="width: 184pt;" span="13"> <!-- MISSING COLGROUP tag-->
<tbody><tr height="20" style="height: 15pt;">
<td width="80" height="20" style="width: 60pt; height: 15pt; color: blue; text-decoration: underline; text-underline-style: single;"><span style="color: blue;">31109173</span></td>
<td width="245" style="width: 184pt; font-family: Arial; font-size: 9pt;">31109173</td>
<td width="245" align="right" style="width: 184pt; font-family: Arial; font-size: 9pt;">May 09,2017 9:54 AM</td>
<td width="245" align="right" style="width: 184pt; font-family: Arial; font-size: 9pt;">May 08,2017 5:21 PM</td>
</tr>
<tr height="20" style="height: 15pt;">
<td height="20" style="height: 15pt; color: blue; text-decoration: underline; text-underline-style: single;"><span style="color: blue;">30933775</span></td>
<td style="font-family: Arial; font-size: 9pt;">30933775</td>
<td align="right" style="font-family: Arial; font-size: 9pt;">May 09,2017 9:50 AM</td>
<td align="right" style="font-family: Arial; font-size: 9pt;">Apr 28,2017 6:22 PM</td>
</tr>
</tbody></table>
在:
<table width="3265" class="mce-item-table" style="width: 2452pt; border-collapse: collapse;" border="0" cellspacing="0" cellpadding="0">
<colgroup><col width="80" style="width: 60pt;">
<col width="245" style="width: 184pt;" span="13">
<tbody><tr height="20" style="height: 15pt;">
<td width="80" height="20" style="width: 60pt; height: 15pt; color: blue; text-decoration: underline; text-underline-style: single;"><span style="color: blue;">31109173</span></td>
<td width="245" style="width: 184pt; font-family: Arial; font-size: 9pt;">31109173</td>
<td width="245" align="right" style="width: 184pt; font-family: Arial; font-size: 9pt;">May 09,2017 9:54 AM</td>
<td width="245" align="right" style="width: 184pt; font-family: Arial; font-size: 9pt;">May 08,2017 5:21 PM</td>
</tr>
<tr height="20" style="height: 15pt;">
<td height="20" style="height: 15pt; color: blue; text-decoration: underline; text-underline-style: single;"><span style="color: blue;">30933775</span></td>
<td style="font-family: Arial; font-size: 9pt;">30933775</td>
<td align="right" style="font-family: Arial; font-size: 9pt;">May 09,2017 9:50 AM</td>
<td align="right" style="font-family: Arial; font-size: 9pt;">Apr 28,2017 6:22 PM</td>
</tr>
</tbody></colgroup></table>
<!-- ^^ </colgroup> has appeared above-->
我尝试将“OptionFixNestedTags”标志设置为true。我仍然得到相同的结果。
答案 0 :(得分:0)
我尝试了HTMLAgility包中的各种选项并将其设置为true。这没用。
OptionFixNestedTags = true;
OptionAutoCloseOnEnd = true;
有一个很好的Nuget包清理html。我遇到的问题在这里解决了 - &gt; HtmlSanitizer
希望这有帮助。