from urllib import urlopen
from bs4 import BeautifulSoup
import re
# Copy all of the content from the provided web page
webpage = urlopen('http://stats.espncricinfo.com/indian-premier-league-2012/engine/records/averages/batting.html?id=6680;type=tournament').read()
soup = BeautifulSoup(webpage)
commentary = soup.find_all("tr", "data2")
for i in range(10):
    for stat in commentary[i].stripped_strings:
        print stat,
    print ""
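I am running this Python program in Eclipse. I changed the proxy entries in my network connection settings, but I get the following IOError:
IOError: [Errno socket error] [Errno -2] Name or service not known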
Traceback (most recent call last):
  File "/home/sumanth/workspace/python/scraping.py", line 22, in <module>
    webpage = urlopen('http://stats.espncricinfo.com/indian-premier-league-2012/engine/records/averages/batting.html?id=6680;type=tournament').read()
  File "/usr/lib/python2.7/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.7/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.7/urllib.py", line 344, in open_http
    h.endheaders(data)
  File "/usr/lib/python2.7/httplib.py", line 958, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 818, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 780, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 761, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
IOError: [Errno socket error] [Errno 110] Connection timed out
Answer 0 (score: 1)
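It looks like the problem is with your internet connection rather than with the code. The error "Name or service not known" means the DNS lookup for the host failed, while "Connection timed out" means the DNS lookup succeeded but the remote server could not be reached.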
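
To tell the two failures apart without going through urllib, something like the following rough sketch may help. The helper names `classify` and `check_host` are made up for illustration, and the port and timeout defaults are assumptions. It opens a plain TCP connection with the standard `socket` module and reports which stage failed.

```python
import errno
import socket


def classify(exc):
    """Map a socket-level exception to a short diagnosis (hypothetical helper)."""
    # socket.gaierror is raised when name resolution fails,
    # e.g. "[Errno -2] Name or service not known".
    if isinstance(exc, socket.gaierror):
        return "dns lookup failed"
    # The name resolved, but the server did not answer in time:
    # either our own timeout expired or the OS reported ETIMEDOUT.
    if isinstance(exc, socket.timeout) or getattr(exc, "errno", None) == errno.ETIMEDOUT:
        return "connection timed out"
    return "other network error"


def check_host(host, port=80, timeout=5):
    """Try one TCP connection to host:port and describe the outcome."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except socket.error as exc:  # gaierror and timeout are subclasses
        return classify(exc)
    sock.close()
    return "ok"
```

Calling `check_host('stats.espncricinfo.com')` from the same Eclipse run configuration should show whether the failure is at the DNS stage or the connection stage. Note that this connects directly and does not go through any HTTP proxy, so behind a mandatory proxy it may fail even when the proxy itself works.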