当使用Rails ActionView :: Helpers :: SanitizeHelper清理过的html时,如果有多个“ wbr”标签,它就会失败。
从其他应用程序粘贴的文本包含的HTML实际上具有数百个“ wbr”元素。
当“ wbr”元素和出现的外部元素的组合“深度”达到255时,文档中的所有其他文本似乎都将被删除。
这可能意味着重要信息丢失。
例如,如果对以下片段运行sanize:
<div>
Some text here
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr><wbr>
<wbr><wbr><wbr><wbr><wbr>
More important text here
</div>
结果不包含第二段文字。即结果是:
<div>
Some text here
</div>
我正在寻找一种安全地清理从其他应用程序粘贴的html而不丢失任何内容的方法。
我显然可以在清理之前使用gsub用空格替换所有'wbr'元素,但我想知道没有其他场景会以相同的方式造成数据丢失。
请注意,如果您尝试从示例段中删除“ wbr”元素,那么Rails :: Html :: TargetScrubber也会遇到类似的问题,然后它也会删除最后一个文本。