Skip a URL if the request times out (Python requests)

Time: 2016-07-16 14:03:05

Tags: python timeout python-requests

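I want the following script to try every URL in url_list using the "requests" lib. If the URL exists, it should print that it exists (url); if not, it should print that it doesn't exist (url); and if the request times out, it should skip to the next URL: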

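import requests

# requests needs an explicit scheme, so each entry starts with http://
url_list = ['http://www.google.com',
            'http://www.urlthatwilltimeout.com',
            'http://www.urlthatdoesntexist.com']

def validate(url, response):
    # Treat HTTP 200 as "exists" and any other status as "doesn't exist".
    if response.status_code == 200:
        print("exists {0}".format(url))
    else:
        print("doesn't exist {0}".format(url))

for url in url_list:
    try:
        response = requests.get(url, timeout=10)
    except requests.exceptions.Timeout:  # any option that is similar?
        print("timed out")
        continue
    validate(url, response)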

1 answer:

Answer 0: (score: 0)

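Based on this SO answer, the code below limits the total time taken by the GET request and also identifies other exceptions that may occur.

Note that in requests 2.4.0 and later you can specify a connect timeout and a read timeout separately, using the syntax: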

requests.get(..., timeout=(...conn timeout..., ...read timeout...))

However, the read timeout only limits the wait between individual read calls; it is not a timeout for the request as a whole.
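The tuple form might be used roughly as follows. This is only a sketch: `check_url` is a hypothetical helper, the 3.05 and 10 second values are arbitrary assumptions, and the `get` parameter exists only so the function can be exercised without a network. The exception classes are the ones requests exposes in `requests.exceptions`.

```python
import requests


def check_url(url, connect_timeout=3.05, read_timeout=10, get=requests.get):
    """Return a short status string for url (hypothetical helper)."""
    try:
        # First value caps connection setup; second caps each wait for data.
        response = get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.ConnectTimeout:
        return "connect timed out"
    except requests.exceptions.ReadTimeout:
        return "read timed out"
    except requests.exceptions.RequestException as exc:
        # Any other requests failure, e.g. DNS errors or refused connections.
        return "failed: {0}".format(type(exc).__name__)
    if response.status_code == 200:
        return "exists"
    return "status {0}".format(response.status_code)
```

Looping over a URL list and calling `check_url(url)` skips slow hosts without stopping the script. A server that keeps trickling bytes can still exceed both limits in total, which is the gap the code in this answer addresses.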

Code:

import requests
import eventlet
# Make blocking socket calls cooperative so eventlet's Timeout can interrupt them.
eventlet.monkey_patch()

url_list = ['http://localhost:3000/delay/0',
            'http://localhost:3000/delay/20',
            'http://localhost:3333/',         # no server listening
            'http://www.google.com'
           ]

for url in url_list:
    try:
        # Abort the request if it takes longer than 1 second in total.
        with eventlet.timeout.Timeout(1):
            response = requests.get(url)
        print("OK - {0}".format(url))
    except requests.exceptions.ReadTimeout:
        print("READ TIMED OUT - {0}".format(url))
    except requests.exceptions.ConnectionError:
        print("CONNECT ERROR - {0}".format(url))
    except eventlet.timeout.Timeout:
        print("TOTAL TIMEOUT - {0}".format(url))
    except requests.exceptions.RequestException as e:
        print("OTHER REQUESTS EXCEPTION - {0} {1}".format(url, e))
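If monkey patching with eventlet is not an option, a similar cap on total time can be approximated with the standard library alone. The sketch below is a swapped-in technique, not the approach used in this answer: `call_with_deadline` is a hypothetical helper that runs a callable in a worker thread and stops waiting after a deadline. It cannot cancel the worker, which keeps running in the background until the call returns.

```python
import concurrent.futures


def call_with_deadline(func, seconds, *args, **kwargs):
    """Run func(*args, **kwargs) in a worker thread, waiting at most `seconds`.

    Raises concurrent.futures.TimeoutError if the deadline passes first.
    The worker is abandoned, not cancelled.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = executor.submit(func, *args, **kwargs)
        return future.result(timeout=seconds)
    finally:
        # wait=False so a slow call cannot hold the caller past the deadline.
        executor.shutdown(wait=False)
```

With requests this could be called as `call_with_deadline(requests.get, 1, url, timeout=(3.05, 10))`, catching `concurrent.futures.TimeoutError` to move on to the next URL. Keeping the per-read timeout in place means the abandoned worker eventually finishes instead of hanging indefinitely.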

Here is a quick server you can use to test it:

var express = require('express');
var sleep = require('sleep');
var app = express();

// GET /delay/:secs blocks for the given number of seconds before replying.
app.get('/delay/:secs', function(req, res) {
  var secs = parseInt(req.params.secs, 10);
  sleep.sleep(secs);
  res.send('Done sleeping for ' + secs + ' seconds');
});

app.listen(3000, function () {
  console.log('Example app listening on port 3000!');
});
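For anyone who would rather keep the test setup in Python, a comparable delay server can be sketched with the standard library. This is an assumed stand-in for the Express server above, not part of the original answer: `DelayHandler` and `make_server` are hypothetical names, and the port defaults to 3000 to match the URLs in the client code.

```python
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class DelayHandler(BaseHTTPRequestHandler):
    """Answers GET /delay/<secs> after sleeping for <secs> seconds."""

    def do_GET(self):
        prefix = "/delay/"
        if not self.path.startswith(prefix):
            self.send_error(404)
            return
        try:
            secs = int(self.path[len(prefix):])
        except ValueError:
            self.send_error(400)
            return
        if secs < 0:
            self.send_error(400)
            return
        time.sleep(secs)
        body = "Done sleeping for {0} seconds".format(secs).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging


def make_server(port=3000):
    """Build a server bound to 127.0.0.1 on the given port."""
    return ThreadingHTTPServer(("127.0.0.1", port), DelayHandler)
```

Start it with `make_server().serve_forever()`. Each request is handled in its own thread, so a long `/delay/20` call does not block a concurrent `/delay/0`.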