Question

我最近一直在提高我的网络服务器的安全性，我使用http.server和BaseHTTPRequestHandler编写了自己。我已阻止（403'd）最重要的服务器文件，我不希望用户能够访问。文件包括python服务器脚本和所有数据库，以及一些HTML模板。

但是，在stackoverflow上的this帖子中，我读到在do_GET请求中使用open(curdir + sep + self.path)可能会使计算机上的每个文件都可读。谁可以给我解释一下这个？如果self.path每次都是ip:port/index.html，那么有人如何访问位于根/目录之上的文件？

我理解用户（显然）可以将index.html更改为其他任何内容，但我看不到他们如何访问root以上的目录。

另外，如果您想知道我为什么不使用nginx或apache，我想创建自己的网络服务器和网站以供学习。我不打算自己运行一个真实的网站，如果我愿意，我可能会租用服务器或使用现有的服务器软件。

class Handler(http.server.BaseHTTPRequestHandler):

    def do_GET(self):

        try:
            if "SOME BLOCKED FILE OR DIRECTORY" in self.path:
                self.send_error(403, "FORBIDDEN")
                return
            #I have about 6 more of these 403 parts, but I left them out for readability
            if self.path.endswith(".html"):
                if self.path.endswith("index.html"):
                    #template is the Template Engine that I created to create dynamic HTML content
                    parser = template.TemplateEngine()
                    content = parser.get_content("index", False, "None", False)
                    self.send_response(200)
                    self.send_header("Content-type", "text/html")
                    self.end_headers()
                    self.wfile.write(content.encode("utf-8"))
                    return
                elif self.path.endswith("auth.html"):
                    parser = template.TemplateEngine()
                    content = parser.get_content("auth", False, "None", False)
                    self.send_response(200)
                    self.send_header("Content-type", "text/html")
                    self.end_headers()
                    self.wfile.write(content.encode("utf-8"))
                    return
                elif self.path.endswith("about.html"):
                    parser = template.TemplateEngine()
                    content = parser.get_content("about", False, "None", False)
                    self.send_response(200)
                    self.send_header("Content-type", "text/html")
                    self.end_headers()
                    self.wfile.write(content.encode("utf-8"))
                    return
                else:
                    try:
                        f = open(curdir + sep + self.path, "rb")
                        self.send_response(200)
                        self.send_header("Content-type", "text/html")
                        self.end_headers()
                        self.wfile.write((f.read()))
                        f.close()
                        return
                    except IOError as e:
                        self.send_response(404)
                        self.send_header("Content-type", "text/html")
                        self.end_headers()
                        return
            else:
                if self.path.endswith(".css"):
                    h1 = "Content-type" 
                    h2 = "text/css"
                elif self.path.endswith(".gif"):
                    h1 = "Content-type"
                    h2 = "gif"
                elif self.path.endswith(".jpg"):
                    h1 = "Content-type"
                    h2 = "jpg"
                elif self.path.endswith(".png"):
                    h1 = "Content-type"
                    h2 = "png"
                elif self.path.endswith(".ico"):
                    h1 = "Content-type"
                    h2 = "ico"
                elif self.path.endswith(".py"):
                    h1 = "Content-type"
                    h2 = "text/py"
                elif self.path.endswith(".js"):
                    h1 = "Content-type"
                    h2 = "application/javascript"
                else:
                    h1 = "Content-type"
                    h2 = "text"
                f = open(curdir+ sep + self.path, "rb")
                self.send_response(200)
                self.send_header(h1, h2)
                self.end_headers()
                self.wfile.write(f.read())
                f.close()
                return
        except IOError:
            if "html_form_action.asp" in self.path:
                pass
            else:
                self.send_error(404, "File not found: %s" % self.path)
        except Exception as e:
            self.send_error(500)
            print("Unknown exception in do_GET: %s" % e)

Answer 1

你的假设是无效的：

如果self.path每次都是ip:port/index.html，那么有人如何访问根/目录上方的文件？

但self.path 从不 ip:port/index.html。尝试记录它，看看你得到了什么。

例如，如果我请求http://example.com:8080/foo/bar/index.html，则self.path不是example.com:8080/foo/bar/index.html，而只是/foo/bar/index.html。实际上，您的代码无法正常工作，因为curdir+ sep + self.path会为您提供一条以./example.com:8080/开头的路径，该路径将不存在。

然后问问自己，如果它是/../../../../../../../etc/passwd会发生什么。

这是使用os.path而不是路径字符串操作的众多原因之一。例如，而不是：

f = open(curdir + sep + self.path, "rb")

这样做：

path = os.path.abspath(os.path.join(curdir, self.path))
if os.path.commonprefix((path, curdir)) != curdir:
    # illegal!

我假设curdir这里是绝对路径，而不仅仅是from os import curdir或其他更有可能给你.的东西。如果是后者，请确保abspath也是如此。

这可以捕捉其他逃脱监狱的方式以及传递..字符串......但它不会捕捉到一切。例如，如果有一个符号链接指向jail，那么abspath无法告诉某人已通过符号链接。

Answer 2

self.path包含请求路径。如果我要发送GET请求并询问位于/../../../../../../../etc/passwd的资源，我会中断您应用程序的当前文件夹并能够访问您文件系统上的任何文件（您有权访问读取）。

Python33 - 使用BaseHTTPRequestHandler提高服务器安全性

2 个答案: