I'm using the code from this Stack Overflow post to unshorten URLs...
import httplib
import urlparse

def unshorten_url(url):
    parsed = urlparse.urlparse(url)
    h = httplib.HTTPConnection(parsed.netloc)
    resource = parsed.path
    if parsed.query != "":
        resource += "?" + parsed.query
    h.request('HEAD', resource)
    response = h.getresponse()
    if response.status/100 == 3 and response.getheader('Location'):
        return unshorten_url(response.getheader('Location')) # changed to process chains of short urls
    else:
        return url
All shortened links get unshortened fine, except for newly created bit.ly URLs.
For those I get this error:
>>> unshorten_url("bit.ly/1atTViN")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in unshorten_url
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 955, in request
    self._send_request(method, url, body, headers)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 989, in _send_request
    self.endheaders(body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 951, in endheaders
    self._send_output(message_body)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 811, in _send_output
    self.send(msg)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 773, in send
    self.connect()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 754, in connect
    self.timeout, self.source_address)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 571, in create_connection
    raise err
socket.error: [Errno 61] Connection refused
What gives?
Answer 0 (score: 3)
You forgot to include the URL scheme:
unshorten_url("http://bit.ly/1atTViN")
Note the http:// there; it is important. Without it, the URL is not parsed correctly:
>>> import urlparse
>>> urlparse.urlparse('bit.ly/1atTViN')
ParseResult(scheme='', netloc='', path='bit.ly/1atTViN', params='', query='', fragment='')
>>> urlparse.urlparse('http://bit.ly/1atTViN')
ParseResult(scheme='http', netloc='bit.ly', path='/1atTViN', params='', query='', fragment='')
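See how the netloc attribute is empty if you do not include http://. With an empty host, you end up trying to connect to your own machine, and since you are not running a web server there, the connection is refused.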
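If the function may receive bare hostnames like the one in the question, one option is to normalize the input before parsing it. The following is only a minimal sketch: ensure_scheme and its default_scheme parameter are hypothetical names, not part of urlparse or httplib, and the import fallback is there only so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urlparse import urlparse        # Python 2, as in the question
except ImportError:
    from urllib.parse import urlparse    # Python 3 equivalent

def ensure_scheme(url, default_scheme='http'):
    """Hypothetical helper: prepend a scheme when the URL has no host part."""
    parsed = urlparse(url)
    if parsed.scheme and parsed.netloc:
        # Already a full URL such as http://bit.ly/1atTViN
        return url
    if url.startswith('//'):
        # Scheme-relative URL: only the scheme itself is missing
        return default_scheme + ':' + url
    # Bare "host/path" input: urlparse put everything into .path
    return default_scheme + '://' + url
```

Calling unshorten_url(ensure_scheme("bit.ly/1atTViN")) would then connect to bit.ly instead of the local machine. Note that this assumes every input without a host is meant to be an HTTP URL, which may not hold for inputs such as mailto: addresses.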
Answer 1 (score: 0)
Perhaps bit.ly is refusing connections from tools like httplib. You could try changing the user agent:
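# Pass the header through request(); putheader() only works between putrequest() and endheaders()
h.request('HEAD', resource, headers={'User-Agent': 'Mozilla/5.0 (X11; U; Linux i686; pl-PL; rv:1.7.10) Gecko/20050717 Firefox/1.0.6'})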