Proxies and URLs in Python urllib2

Date: 2013-11-15 08:04:12

Tags: python python-2.7 proxy urllib2

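Here is my code. It gives me errors that I can't resolve. The same code runs fine with a single URL and a single proxy, but it fails when the proxies and URLs are read from files.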

import urllib2
import time 
#bangalore, boston,china

with open('urls.txt') as f:
    urls = [line.strip() for line in f]
    print "list of urls",urls
with open('proxies.txt') as proxies:
    for proxy in proxies:
        print proxy
        proxy = proxy.rstrip()
        print proxy
        proxy_handler = urllib2.ProxyHandler(proxy)
        opener = urllib2.build_opener(proxy_handler)
        urllib2.install_opener(opener)
        try:
            for url in urls:
                request=urllib2.Request(url)
                start=time.time()
                try:
                    print "from try block"
                    response=urllib2.urlopen(urls[0])
                    response.read(1)
                    ttfb = time.time() - start
                    print "Latency:", ttfb
                    print "Status Code:", response.code
                    print "Headers:", response.headers
                    print "Redirected url:", response.url  
                except urllib2.URLError as e:
                    print "From except"
                    print "Error Reason:", e.reason
                    print "Error Message:", e.message
                   # print "Redirected URL:", e.url
                except urllib2.HTTPError as e:
                    print e.reason 
        except Exception,e:
            print e

1 answer:

Answer 0 (score: 0)

Replace the line proxy = proxy.rstrip() with:

proxy = json.loads(proxy.rstrip())

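(and import json). urllib2.ProxyHandler expects a dict that maps a scheme to a proxy URL, not a plain string, so each line of proxies.txt has to be parsed into a dict first.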

Lines in urls.txt look like:

http://www.google.com

Lines in proxies.txt look like:

{"http" : "http://ip:port"}
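A minimal sketch of how one such line becomes the dict passed to urllib2.ProxyHandler. The helper name parse_proxy_line and the address 10.0.0.1:8080 are made up for illustration. The import fallback is not part of the original post; it is only there so the snippet also runs on Python 3, where urllib2 became urllib.request.

```python
import json

try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent (assumption, not in the post)


def parse_proxy_line(line):
    """Turn one proxies.txt line (a JSON object) into a dict for ProxyHandler."""
    proxy = json.loads(line.strip())
    if not isinstance(proxy, dict):
        raise ValueError("expected a JSON object mapping scheme to proxy URL")
    return proxy


# Hypothetical proxy address, for illustration only.
proxy = parse_proxy_line('{"http" : "http://10.0.0.1:8080"}\n')

# Building the opener does not send any request yet.
opener = urllib2.build_opener(urllib2.ProxyHandler(proxy))
```

A line that is not a JSON object, such as a bare ip:port string, fails early here instead of producing a confusing error inside urllib2.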

As I noted in my comment on your post, this line also always requests the first URL instead of the loop variable url:

response=urllib2.urlopen(urls[0])
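Putting both points together, the inner loop could look roughly like the sketch below. The function name fetch_all, the timeout default and the shape of the returned tuples are assumptions for illustration, not part of the original code. It calls opener.open directly rather than install_opener, and it catches HTTPError before URLError because HTTPError is a subclass of URLError and would otherwise never reach its own handler.

```python
import time

try:
    import urllib2  # Python 2, as in the question
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent (assumption, not in the post)


def fetch_all(proxy, urls, timeout=10):
    """Request every URL through one proxy dict.

    Returns a list of (url, latency_or_None, status_code_or_error_text).
    """
    opener = urllib2.build_opener(urllib2.ProxyHandler(proxy))
    results = []
    for url in urls:
        start = time.time()
        try:
            response = opener.open(url, timeout=timeout)  # url, not urls[0]
            response.read(1)  # block until the first byte arrives
            results.append((url, time.time() - start, response.code))
        except urllib2.HTTPError as e:
            # Must come first: HTTPError is a subclass of URLError.
            results.append((url, time.time() - start, e.code))
        except urllib2.URLError as e:
            results.append((url, None, str(e.reason)))
    return results
```

It would be called once per parsed line of proxies.txt, for example fetch_all({"http": "http://ip:port"}, urls), with the placeholder replaced by a real proxy address.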