Question

I'm using the Requests library in Python. In a browser, my URL loads fine. In Python, it returns a 403.

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access /admin/license.php on this server.</p>
<p>Additionally, a 404 Not Found error was encountered while trying to use an ErrorDocument to handle the request.</p>
</body></html>

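This is my own website, and I know there isn't any bot protection on it. I wrote the PHP file I'm loading; it's just a simple database query. In the root of the site I have a WordPress site with default settings. However, I'm not sure whether that is relevant.

My code: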

import requests
url = "myprivateurl.com"
r = requests.get(url)
print(r.text)

Does anyone have a guess as to why it returns a 403 from Python but not from the browser?

Thanks so much.

Answer 1

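After I contacted my web host and the ticket was escalated to Tier 2 support, they disabled mod_security, and now it works fine. I'm not sure whether disabling it is a bad thing, but it fixed the problem.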

Answer 2

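myprivateurl.com is not a valid URL. Firefox goes through a lot of user-friendly steps to guess what you actually meant, and (depending somewhat on resolver results and so forth) will end up with something like http://myprivateurl.com/. Requests doesn't do this; you have to pass it a real, valid URL.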
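
To illustrate the point about the missing scheme, here is a minimal sketch of a helper that adds one before the URL is handed to Requests. The name `ensure_scheme` and its `default_scheme` parameter are made up for this example, and defaulting to `http` is an assumption; use `https` if the site serves it.

```python
import re

# Matches a leading URL scheme such as "http://" or "https://".
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def ensure_scheme(url, default_scheme="http"):
    """Return url unchanged if it starts with a scheme, else prefix one.

    Hypothetical helper: browsers guess the scheme for you, Requests does not.
    """
    if _SCHEME_RE.match(url):
        return url
    return "{}://{}".format(default_scheme, url)


if __name__ == "__main__":
    print(ensure_scheme("myprivateurl.com"))
```

The result can then be passed to `requests.get(...)` in place of the bare host name.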

Answer 3

Adding a header to the request worked for me:

import urllib.request

def fetch_html(url):
    # Send a browser-like User-Agent instead of the default Python one
    req = urllib.request.Request(url)
    req.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7')
    response = urllib.request.urlopen(req)
    data = response.read()      # a `bytes` object
    html = data.decode('utf-8') # a `str`; this step can't be used if data is binary
    return html
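
Since the question uses Requests rather than urllib, the same idea might look roughly like this there. This is only a sketch: the names `BROWSER_UA`, `browser_headers`, and `get_with_browser_ua` are made up for the example, the User-Agent string is simply the one from the snippet above, and whether the server actually filters on User-Agent is an assumption.

```python
import requests

# Assumption: any browser-like User-Agent string will do; this one is
# copied from the urllib snippet above.
BROWSER_UA = (
    "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) "
    "Gecko/2009021910 Firefox/3.0.7"
)


def browser_headers(user_agent=BROWSER_UA):
    """Build the headers dict sent with the request."""
    return {"User-Agent": user_agent}


def get_with_browser_ua(url, timeout=10):
    """GET url with a browser-like User-Agent and return the body as text."""
    response = requests.get(url, headers=browser_headers(), timeout=timeout)
    # Raises requests.HTTPError for 4xx/5xx responses, including a 403.
    response.raise_for_status()
    return response.text
```

If this still produces a 403, the header is probably not what the server objects to, and something like the mod_security rule mentioned in Answer 1 is a more likely cause.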

Request works fine in the browser but returns 403 in Python
