This works fine in the development environment:
url = "http://www.google.com/"
return urllib2.urlopen(url)
But when I upload it to Google App Engine and run it there, I get the following error:
return urllib2.urlopen(url)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 127, in urlopen
return _opener.open(url, data, timeout)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 404, in open
response = self._open(req, data)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 422, in _open
'_open', req)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 382, in _call_chain
result = func(*args)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 1214, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/base/data/home/runtimes/python27/python27_dist/lib/python2.7/urllib2.py", line 1184, in do_open
raise URLError(err)
URLError: <urlopen error [Errno 13] Permission denied>
Does anyone know why this happens? Many thanks!
Answer 0 (score: 0)
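This may be a bot or DDoS protection measure implemented by Google. Google probably does not want people hosting services on its own GAE and then loading its home page from there. A single request may well be innocent, but imagine your urlopen call sat in an infinite loop: Google's own infrastructure could then be used to attack its own home page. That would also explain why (most) other URLs work.
Some URLs (for example, Amazon product pages) return a 403 when requested through urlopen from Google Cloud IP addresses. Amazon does this to keep out bots, content scraping, DoS, and so on. The same principle applies.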
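To tell the two failure modes described here apart (a request that never goes through, as in the traceback, versus a server that answers with a 403), something like the following might help. It is only a sketch: the `fetch` helper and its `opener` parameter are hypothetical names introduced for illustration, and the import fallback is there only so the snippet also runs outside Python 2.7.

```python
try:
    # Python 2.7, as used in the question
    import urllib2 as urlrequest
    from urllib2 import HTTPError, URLError
except ImportError:
    # Fallback so the sketch also runs on Python 3
    import urllib.request as urlrequest
    from urllib.error import HTTPError, URLError


def fetch(url, opener=urlrequest.urlopen):
    """Hypothetical helper: return a (kind, detail) tuple instead of raising."""
    try:
        response = opener(url)
        return ("ok", response.read())
    except HTTPError as err:
        # The server replied but refused the request, e.g. 403 Forbidden.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return ("http_error", err.code)
    except URLError as err:
        # The request never completed, e.g. [Errno 13] Permission denied.
        return ("url_error", str(err.reason))
```

Calling `fetch("http://www.google.com/")` and logging the returned tuple would show whether the request was blocked before it left (`"url_error"`) or rejected by the remote site (`"http_error"` with the status code).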