没有定义美丽的汤HTTPError错误的组合异常

时间:2013-11-15 17:50:37

标签: python beautifulsoup urllib2 http-error

我正在尝试编写代码,以便将股票代码数据废弃到csv文件中。但是,我收到以下错误。

Traceback (most recent call last):
  File "company_data_v3.py", line 23, in <module>
    page = urllib2.urlopen("http://finance.yahoo.com/q/ks?s="+newsymbolslist[i] +"%20Key%20Statistics").read()
  File "C:\Python27\lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 410, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 400: Bad Request

我已经尝试了这个suggestion但是没有将urlib2 HTTPError导入程序的工作。 (这似乎是多余的,因为我已经导入了模块。

symbols.txt文件包含股票代码。这是我正在使用的代码:

import urllib2
from BeautifulSoup import BeautifulSoup
import csv
import re
import urllib
from urllib2 import HTTPError
# import modules

symbolfile = open("symbols.txt")
symbolslist = symbolfile.read()
newsymbolslist = symbolslist.split("\n")

i = 0

f = csv.writer(open("pe_ratio.csv","wb"))
# short cut to write

f.writerow(["Name","PE","Revenue % Quarterly","ROA% YOY","Operating Cashflow","Debt to Equity"])
#first write row statement

# define name_company as the following
while i<len(newsymbolslist):
    page = urllib2.urlopen("http://finance.yahoo.com/q/ks?s="+newsymbolslist[i] +"%20Key%20Statistics").read()
    soup = BeautifulSoup(page)
    name_company = soup.findAll("div", {"class" : "title"}) 
    for name in name_company: #add multiple iterations?        
        all_data = soup.findAll('td', "yfnc_tabledata1")
        stock_name = name.find('h2').string #find company's name in name_company with h2 tag
        try:    
            f.writerow([stock_name, all_data[2].getText(),all_data[17].getText(),all_data[13].getText(), all_data[29].getText(),all_data[26].getText()]) #write down PE data
        except (IndexError, urllib2.HTTPError) as e:
            pass
        i+=1    

我是否需要更具体地定义错误?谢谢你的帮助。

1 个答案:

答案 0 :(得分:1)

您正在错误的位置捕获异常。 urlopen()调用抛出异常,如回溯的第一行所示:

Traceback (most recent call last):
  File "company_data_v3.py", line 23, in <module>
    page = urllib2.urlopen("http://finance.yahoo.com/q/ks?s="+newsymbolslist[i] +"%20Key%20Statistics").read()

抓住它:

while i<len(newsymbolslist):
    try:
        page = urllib2.urlopen("http://finance.yahoo.com/q/ks?s="+newsymbolslist[i] +"%20Key%20Statistics").read()
    except urllib2.HTTPError:
        continue