Python urllib2.urlopen returns HTTP Error 503

Time: 2014-12-08 09:28:38

Tags: python urllib2 urlopen

Below is my code snippet. It stopped working three days ago. My Python runs on Ubuntu 10.04.4 LTS, and the Python version is 2.6.5.

#!/usr/bin/env python
import urllib2 as ur
...
webpage = []

site = "http://www.gametracker.com/server_info/94.250.218.247:25200/top_players/"
hdr = {'User-Agent': 'Mozilla/5.0'}
req = ur.Request(site, headers=hdr)
data = ur.urlopen(req)
for line in data:
    line = line.split(",")
    webpage.append(line)
...

Here is the error message it returns:

Traceback (most recent call last):

File "read_top5.py", line 21, in <module>
  data = ur.urlopen(req)
File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
  return _opener.open(url, data, timeout)
File "/usr/lib/python2.6/urllib2.py", line 397, in open
  response = meth(req, response)
File "/usr/lib/python2.6/urllib2.py", line 510, in http_response
  'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.6/urllib2.py", line 435, in error
  return self._call_chain(*args)
File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
  result = func(*args)
File "/usr/lib/python2.6/urllib2.py", line 518, in http_error_default
  raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 503: Service Temporarily Unavailable

2 Answers:

Answer 0: (score: 5)

The service is currently unavailable. Requesting the same URL with curl

curl -i "http://www.gametracker.com/server_info/94.250.218.247:25200/top_players/"

also returns a 503:

HTTP/1.1 503 Service Temporarily Unavailable
Date: Mon, 08 Dec 2014 09:37:17 GMT
Content-Type: text/html; charset=UTF-8
Server: cloudflare-nginx
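
The same status code and headers can be inspected from Python by catching the HTTPError instead of letting it propagate. The following is a minimal sketch, not the asker's code. It is written for Python 3, where urllib2 was split into urllib.request and urllib.error (on Python 2.6 the corresponding names are urllib2.urlopen and urllib2.HTTPError). The helper names summarize_error and fetch_status are hypothetical.

```python
import urllib.error
import urllib.request


def summarize_error(err):
    """Return (status code, Server header, body bytes) from an HTTPError."""
    # HTTPError doubles as a response object, so the body can still be read.
    body = err.read() if err.fp is not None else b""
    return err.code, err.headers.get("Server"), body


def fetch_status(url, user_agent="Mozilla/5.0"):
    """Request url; return (code, Server header, body) even for 4xx/5xx replies."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status, resp.headers.get("Server"), resp.read()
    except urllib.error.HTTPError as err:
        return summarize_error(err)
```

Calling fetch_status(site) with the URL from the question would then return the 503 code and the Server header rather than raising, which makes it easier to see that the reply comes from the site's front end and not from a bug in the script.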

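The service is using CloudFlare, which offers a form of DDoS protection that requires you to connect with a full web browser.

While you could probably work around this, by choosing to use this service the site operators are stating that they do not want you to connect with scripts.

This is not a programming problem; you need to establish why the service is not available to scripts.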

Answer 1: (score: 1)

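This is just something the website does. It appears to be part of some kind of anti-DDoS system. Why it returns a 503 is puzzling, but it is definitely the website itself.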

I tried the curl command Joe posted above, and this is the response I got:

HTTP/1.1 503 Service Temporarily Unavailable
Date: Mon, 08 Dec 2014 09:47:41 GMT
Content-Type: text/html; charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=d32f001037fafc1363bf86d29be0baf921418032061; expires=Tue, 08-Dec-15 09:47:41 GMT; path=/; domain=.gametracker.com; HttpOnly
X-Frame-Options: SAMEORIGIN
Cache-Control: no-cache
Server: cloudflare-nginx
CF-RAY: 19580b02d7c70f21-IAD

<!DOCTYPE HTML>
<html lang="en-US">
<head>
  <meta charset="UTF-8" />
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />
  <meta name="robots" content="noindex, nofollow" />
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
  <title>Just a moment...</title>
  <style type="text/css">
    html, body {width: 100%; height: 100%; margin: 0; padding: 0;}
    body {background-color: #ffffff; font-family: Helvetica, Arial, sans-serif; font-size: 100%;}
    h1 {font-size: 1.5em; color: #404040; text-align: center;}
    p {font-size: 1em; color: #404040; text-align: center; margin: 10px 0 0 0;}
    #spinner {margin: 0 auto 30px auto; display: block;}
    .attribution {margin-top: 20px;}
  </style>

    <script type="text/javascript">
  //<![CDATA[
  (function(){
    var a = function() {try{return !!window.addEventListener} catch(e) {return !1} },
    b = function(b, c) {a() ? document.addEventListener("DOMContentLoaded", b, c) : document.attachEvent("onreadystatechange", b)};
    b(function(){
      var a = document.getElementById('cf-content');a.style.display = 'block';
      setTimeout(function(){
        var t,r,a,f, sdDUenl={"xRvHG":+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]+!![]))};
        t = document.createElement('div');
        t.innerHTML="<a href='/'>x</a>";
        t = t.firstChild.href;r = t.match(/https?:\/\//)[0];
        t = t.substr(r.length); t = t.substr(0,t.length-1);
        a = document.getElementById('jschl-answer');
        f = document.getElementById('challenge-form');
        ;sdDUenl.xRvHG*=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]));sdDUenl.xRvHG-=+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]));sdDUenl.xRvHG+=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]));sdDUenl.xRvHG*=+((!+[]+!![]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]));sdDUenl.xRvHG-=+((!+[]+!![]+!![]+!![]+!![]+[])+(+[]));sdDUenl.xRvHG-=+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]));sdDUenl.xRvHG*=+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]));sdDUenl.xRvHG-=+((!+[]+!![]+!![]+!![]+[])+(!+[]+!![]+!![]));sdDUenl.xRvHG*=+((+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]));sdDUenl.xRvHG+=+((!+[]+!![]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]+!![]));a.value = parseInt(sdDUenl.xRvHG, 10) + t.length;
        f.submit();
      }, 5850);
    }, false);
  })();
  //]]>
</script>


</head>
<body>
  <table width="100%" height="100%" cellpadding="20">
    <tr>
      <td align="center" valign="middle">
          <div class="cf-browser-verification cf-im-under-attack">
  <noscript><h1 data-translate="turn_on_js" style="color:#bd2426;">Please turn JavaScript on and reload the page.</h1></noscript>
  <div id="cf-content" style="display:none">
    <img id="spinner" src="/cdn-cgi/images/spinner-2013.gif" />
    <h1><span data-translate="checking_browser">Checking your browser before accessing</span> gametracker.com.</h1>
    <p data-translate="process_is_automatic">This process is automatic. Your browser will redirect to your requested content shortly.</p>
    <p data-translate="allow_5_secs">Please allow up to 5 seconds&hellip;</p>
  </div>
  <form id="challenge-form" action="/cdn-cgi/l/chk_jschl" method="get">
    <input type="hidden" name="jschl_vc" value="3cecd7cab5d69708a3b1081e462824d0"/>
    <input type="hidden" id="jschl-answer" name="jschl_answer"/>
  </form>
</div>


          <div class="attribution"><a href="http://www.cloudflare.com/" target="_blank" style="font-size: 12px;">DDoS protection by CloudFlare</a></div>
      </td>
    </tr>
  </table>
</body>
</html>

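Note that the body contains content even though the status code is 503. This is consistent with what I see when I visit the page in a browser: first I am sent to the "anti-DDoS" page you see in the response above, and then I am automatically redirected to the page requested in the URL (apparently via JavaScript, which fills in and submits the challenge form). That explains why it does not behave as expected outside a browser: a Python web request does not execute the JavaScript that performs the redirect.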
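
One way to make a script fail clearly in this situation is to recognize the interstitial page instead of trying to parse it. The check below is a heuristic sketch based only on the response shown above: a 503 status, a Server header containing "cloudflare", and the challenge-form and jschl_answer markers in the body. The function name is hypothetical, and CloudFlare may change these markers at any time.

```python
def looks_like_cloudflare_challenge(status, server_header, body):
    """Heuristically detect the 'Just a moment...' page shown above."""
    if status != 503:
        return False
    if "cloudflare" not in (server_header or "").lower():
        return False
    # Markers copied from the HTML in the response above.
    markers = ('id="challenge-form"', 'name="jschl_answer"')
    return any(marker in body for marker in markers)
```

A script could call this on the decoded body of a failed request and stop with a clear message, rather than splitting the challenge page's HTML on commas as if it were data.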

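So this is definitely the service. You will have to check with the people who run it to find out why they do this and how they expect you to handle it. You may want to see whether they have a different endpoint for API calls, or whether the endpoint responds differently if you set the Accept header. (application/json can be used to indicate that you want JSON back.)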
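
Setting the Accept header as suggested would look roughly like this. It is a sketch in Python 3 syntax (on Python 2.6, urllib2.Request accepts the same headers argument). Whether gametracker.com returns JSON for this header is not known, and build_request is a hypothetical helper name.

```python
import urllib.request


def build_request(url, accept="application/json", user_agent="Mozilla/5.0"):
    """Build a request that advertises the preferred response type."""
    headers = {"User-Agent": user_agent, "Accept": accept}
    return urllib.request.Request(url, headers=headers)


req = build_request(
    "http://www.gametracker.com/server_info/94.250.218.247:25200/top_players/"
)
# Pass req to urllib.request.urlopen(req) to actually send it.
```

If the site's front end still answers with the challenge page, the header makes no difference and the only remaining option is to ask the operators for a supported way to access the data.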