I have been using this Python script successfully:
import httplib2
from BeautifulSoup import BeautifulSoup, SoupStrainer
http = httplib2.Http()
status, response = http.request('https://conceled:conceled@traveler.pha.phila.gov:8443/servlet/traveler')
for link in BeautifulSoup(response, parseOnlyThese=SoupStrainer('a')):
    if link.has_key('href'):
        print link['href']
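to pull the links from a website. It works on almost every other site, but when I try it on the one above (the one I actually need it to work on), I get these errors: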
Traceback (most recent call last):
  File "C:\Users\joe\Desktop\PHA\AndroidPhones\androidphonescript2.py", line 5, in <module>
    status, response = http.request('https://conceled@traveler.pha.phila.gov:8443/servlet/traveler')
  File "C:\Python27\lib\httplib2.py", line 608, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cacheFullPath)
  File "C:\Python27\lib\httplib2.py", line 449, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "C:\Python27\lib\httplib2.py", line 427, in _conn_request
    conn.connect()
  File "C:\Python27\lib\httplib.py", line 1157, in connect
    self.timeout, self.source_address)
  File "C:\Python27\lib\socket.py", line 553, in create_connection
    for res in getaddrinfo(host, port, 0, SOCK_STREAM):
gaierror: [Errno 11003] getaddrinfo failed
Answer 0 (score: 1)
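The site's certificate is invalid, but that does not seem to be what causes the problem. Which version of httplib2 are you using? I just installed the current version, 0.7.7, and I get a more informative exception message: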
  File "d:\Python27\lib\site-packages\httplib2-0.7.7-py2.7.egg\httplib2\__init__.py", line 1287, in _conn_request
    raise ServerNotFoundError("Unable to find the server at %s" % conn.host)
ServerNotFoundError: Unable to find the server at conceled:conceled@traveler.pha.phila.gov
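So it is not parsing //username:password@ as a username and password; the whole string is treated as the host name, which is why the lookup fails. The httplib2 documentation says credentials should be supplied with: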
Http.add_credentials(name, password[, domain=None])
So try:
http = httplib2.Http()
http.add_credentials(name, password)
status, response = http.request('https://traveler.pha.phila.gov:8443/servlet/traveler')
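I don't have an account on the site, so I can't test this.
If you need to support a username and password embedded in the URL, you will have to write code to parse them out yourself. That shouldn't be too hard with a regular expression (Python's re module).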
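As a rough illustration of that last suggestion, here is a minimal sketch of pulling the credentials out of a URL with a regular expression. The helper name split_credentials and the pattern are my own assumptions and are not part of httplib2. It has not been tried against the site in question, and it does not handle percent-encoded characters in the username or password.

```python
import re

# Hypothetical helper (not part of httplib2): matches
# scheme://user[:password]@rest and captures each piece.
_CRED_RE = re.compile(
    r'^(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*://)'  # e.g. "https://"
    r'(?P<user>[^:@/]*)'                        # username
    r'(?::(?P<password>[^@/]*))?'               # optional ":password"
    r'@(?P<rest>.*)$'                           # host, port, path, ...
)


def split_credentials(url):
    """Return (url_without_credentials, username, password).

    If the URL carries no credentials, it is returned unchanged and
    username and password are both None.
    """
    match = _CRED_RE.match(url)
    if match is None:
        return url, None, None
    clean_url = match.group('scheme') + match.group('rest')
    return clean_url, match.group('user'), match.group('password')
```

The cleaned URL and the extracted values could then be passed along, for example http.add_credentials(name, password) followed by http.request(clean_url), using the same calls shown above.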