I have a list of URLs (stock data), and I want to write certain fields from each page into a CSV. For each row I need:
name price recrat opinion
The CSV file is created but contains no data, and I get this error:
ValueError: too many values to unpack
What should I do? Here is my code so far:
# -*- coding: utf-8 -*-
import urllib2
from bs4 import BeautifulSoup
import csv
from datetime import datetime

quote_page = ['http://uk.mobile.reuters.com/business/quotes/overview/AALB.AS',
              'http://uk.mobile.reuters.com/business/stocks/overview/ABNd.AS',
              'http://uk.mobile.reuters.com/business/stocks/overview/ACCG.AS',
              'http://uk.mobile.reuters.com/business/stocks/overview/AD.AS']

for link in quote_page:
    try:
        page = urllib2.urlopen(link)
        soup = BeautifulSoup(page, 'html.parser')
        name_box = soup.find('span', attrs={'class': 'company-name'})
        name = name_box.text.strip()
        print name
        price_box = soup.find('span', attrs={'class': 'price'})
        price = price_box.text.strip()
        print price
        recrating_box = soup.find('div', attrs={'class': 'recommendation-rating'})
        recrat = recrating_box.text.strip()
        print recrat
        opinion = soup.find('div', attrs={'class': 'recommendation-marker'})['style']
        print opinion
    except TypeError:
        continue
    quote_page.append((name, price, recrat, opinion))

# open a csv file with append, so old data will not be erased
with open('index.csv', 'a') as csv_file:
    writer = csv.writer(csv_file)
    for name, price in quote_page:
        writer.writerows([name, price, recrat, opinion, datetime.now()])
Answer 0 (score: 1)

Tested and working:
# -*- coding: utf-8 -*-
import urllib2
from bs4 import BeautifulSoup
import csv
from datetime import datetime

quote_page = ['http://uk.mobile.reuters.com/business/quotes/overview/AALB.AS',
              'http://uk.mobile.reuters.com/business/stocks/overview/ABNd.AS',
              'http://uk.mobile.reuters.com/business/stocks/overview/ACCG.AS',
              'http://uk.mobile.reuters.com/business/stocks/overview/AD.AS']

results = []
for link in quote_page:
    try:
        page = urllib2.urlopen(link)
        soup = BeautifulSoup(page, 'html.parser')
        name_box = soup.find('span', attrs={'class': 'company-name'})
        name = name_box.text.strip()
        print name
        price_box = soup.find('span', attrs={'class': 'price'})
        price = price_box.text.strip()
        print price
        recrating_box = soup.find('div', attrs={'class': 'recommendation-rating'})
        recrat = recrating_box.text.strip()
        print recrat
        opinion = soup.find('div', attrs={'class': 'recommendation-marker'})['style']
        print opinion
    except TypeError:
        continue
    results.append((name, price, recrat, opinion))

# open the csv file with 'w', so each run overwrites the old data
with open('index.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    for item in results:
        writer.writerow([item[0], item[1], item[2], item[3], datetime.now()])
There were three problems. First, you were appending to the very list you were iterating over, which is not a good idea; I collected the scraped rows in a separate list, results.
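A minimal illustration of why appending to a list while looping over it is dangerous (hypothetical data, Python 3 syntax): the loop keeps picking up the items you add, so without a stopping condition it never ends.

```python
# Appending to a list while iterating over it extends the iteration itself.
urls = ['a', 'b']
seen = []
for u in urls:
    seen.append(u)
    if len(urls) < 4:          # guard so this demo terminates
        urls.append(u + '!')   # without the guard, this would loop forever

print(urls)  # ['a', 'b', 'a!', 'b!']
print(seen)  # ['a', 'b', 'a!', 'b!']
```

Writing into a separate list, as the answer does with results, avoids the problem entirely.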
Second, when iterating over the list you unpacked only 2 of the 4 items in each tuple. I switched to indexing into the tuple instead.
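The unpacking error can be reproduced in isolation (the sample tuple below is made up, Python 3 syntax): each row has four elements, so unpacking it into two names raises exactly the ValueError from the question.

```python
rows = [('AALB.AS', '34.50', 'Buy', 'width: 80%')]

# Unpacking a 4-tuple into only two names reproduces the question's error.
try:
    for name, price in rows:
        pass
except ValueError as err:
    print(err)  # too many values to unpack (expected 2)

# Fix: index into the tuple (as the answer does), or unpack all four names.
for name, price, recrat, opinion in rows:
    print(name, price, recrat, opinion)
```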
Finally, since you write the rows one at a time inside the loop, writerows needs to be changed to writerow.
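The writerow/writerows distinction in a nutshell: writerow takes one row (a sequence of cells), while writerows takes an iterable of rows. That is also why the original writerows([name, price, ...]) produced garbage: each string in the list was treated as a row, one character per cell. A sketch with made-up data, using io.StringIO so nothing touches disk (Python 3 syntax):

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)

# writerow: one call writes one CSV line from a sequence of cells.
writer.writerow(['AALB.AS', '34.50'])

# writerows: one call writes many lines from an iterable of cell sequences.
writer.writerows([['ABN.AS', '21.10'], ['AD.AS', '18.75']])

print(buf.getvalue())
# AALB.AS,34.50
# ABN.AS,21.10
# AD.AS,18.75
```

If you keep the loop, writerow is the right call; alternatively, you could drop the loop and pass the whole results list to writerows in a single call.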