Question

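When importing a CSV file, a malicious user could include <script> tags and run arbitrary JavaScript. I have this line of code to filter them out (that is, to turn them into harmless div tags):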

data = data.gsub(/<\s*script[^>]*>/i, '<div style="display:none">').gsub(/<\s*\/+script[^>]*>/i, '</div>')

Is there any way to defeat this filter?

Answer 1

There are many other ways to get an XSS payload into a string. For example, this works in most browsers (yes, even if there is already another body tag):

<body onload="alert('foo');">
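
To check this against the filter from the question, here is a minimal sketch. The helper name filter_script_tags and the two extra payloads (the img and a examples) are assumptions added for illustration; only the body payload comes from this answer.

```ruby
# The two gsub calls from the question, wrapped in a hypothetical helper.
def filter_script_tags(data)
  data.gsub(/<\s*script[^>]*>/i, '<div style="display:none">')
      .gsub(/<\s*\/+script[^>]*>/i, '</div>')
end

# Payloads that carry JavaScript without using a <script> tag.
# Only the first one appears in the answer; the others are illustrative.
payloads = [
  %q{<body onload="alert('foo');">},
  %q{<img src="x" onerror="alert('foo')">},
  %q{<a href="javascript:alert('foo')">click</a>}
]

payloads.each do |payload|
  filtered = filter_script_tags(payload)
  # The filter only looks for <script ...> and </script>, so none of these match.
  puts "#{payload} => unchanged: #{filtered == payload}"
end
```

Each payload passes through the filter unchanged, because none of them contains a literal script tag.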

If you want to offer formatting options, you should HTML-encode all input and use a separate engine such as Markdown for the formatting.
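
As a rough illustration of HTML-encoding, the sketch below uses CGI.escapeHTML from Ruby's standard library. The variable names and the sample cell value are assumptions; how the encoded value is stored or rendered in the real import is not shown in the question.

```ruby
require 'cgi'

# A hypothetical cell value taken from an imported CSV row.
cell = %q{<script>alert('pop this!')</script>}

# Encode the HTML metacharacters so the browser shows the text
# instead of interpreting it as markup.
escaped = CGI.escapeHTML(cell)

puts escaped
# >> &lt;script&gt;alert(&#39;pop this!&#39;)&lt;/script&gt;
```

Because every angle bracket and quote is encoded, the same call also neutralizes payloads that rely on attributes rather than script tags.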

Answer 2

If you want to mess with tags, use a real parser.

It's easy to turn <script> tags into something else:

require 'nokogiri'

doc = Nokogiri::HTML(<<EOT)
  <html>
  <head>
    <script></script>
  </head>
  <body>
    <script>alert('pop this!')</script>
  </body>
  </html>
EOT

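# Replace each <script> node with a <div> that holds the script's text.
# Note: the interpolated string is parsed as markup, so any HTML inside
# the script body becomes real nodes in the document.
doc.search('script').each do |script|
  script.replace("<div>#{ script.content }</div>")
end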

puts doc.to_html
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html>
# >> <head>
# >> <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
# >> <div></div>
# >> </head>
# >> <body>
# >>     <div>alert('pop this!')</div>
# >> </body>
# >> </html>

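This grabs the content of each <script> tag, inserts it inside a <div> tag, and then replaces the <script> tag with that <div>.

Or, you can simply change the name of the <script> tag: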

# Rename each <script> element in place; its children are left untouched.
doc.search('script').each do |script|
  script.name = "div"
end

puts doc.to_html
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html>
# >> <head>
# >> <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
# >> <div></div>
# >> </head>
# >> <body>
# >>     <div>alert('pop this!')</div>
# >> </body>
# >> </html>

Filtering out script tags
