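How do I make Python loop through an array of URLs and write the data to a CSV, one row per URL?

Time: 2017-10-01 08:24:04

Tags: python python-2.7

I have a set of URLs (stock data) and I want to put certain data from each page into a CSV. For each row I need: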

name price recrat opinion

The CSV file is created but contains no data, and I get this error:

ValueError: too many values to unpack

What should I do? Here is my code so far:

# -*- coding: utf-8 -*-
import urllib2
from bs4 import BeautifulSoup
import csv
from datetime import datetime

quote_page = ['http://uk.mobile.reuters.com/business/quotes/overview/AALB.AS',
   'http://uk.mobile.reuters.com/business/stocks/overview/ABNd.AS',
   'http://uk.mobile.reuters.com/business/stocks/overview/ACCG.AS', 
   'http://uk.mobile.reuters.com/business/stocks/overview/AD.AS']


for link in quote_page:
    try:
        page = urllib2.urlopen(link)
        soup = BeautifulSoup(page, 'html.parser')

        name_box = soup.find('span', attrs={'class': 'company-name'})
        name = name_box.text.strip()
        print name

        price_box = soup.find('span', attrs={'class':'price'})
        price = price_box.text.strip()
        print price

        recrating_box = soup.find('div', attrs={'class':'recommendation-rating'})
        recrat = recrating_box.text.strip()
        print recrat

        opinion = soup.find('div', attrs={'class':'recommendation-marker'})['style']
        print opinion
    except TypeError:
        continue

quote_page.append((name, price, recrat, opinion))   
    # open a csv file with append, so old data will not be erased
with open('index.csv', 'a') as csv_file:
    writer = csv.writer(csv_file)
    for name, price in quote_page:
        writer.writerows([name, price, recrat, opinion, datetime.now()])

1 Answer:

Answer 0 (score: 1)

Tested and working:

# -*- coding: utf-8 -*-
import urllib2
from bs4 import BeautifulSoup
import csv
from datetime import datetime

quote_page = ['http://uk.mobile.reuters.com/business/quotes/overview/AALB.AS',
   'http://uk.mobile.reuters.com/business/stocks/overview/ABNd.AS',
   'http://uk.mobile.reuters.com/business/stocks/overview/ACCG.AS', 
   'http://uk.mobile.reuters.com/business/stocks/overview/AD.AS']

results = []

for link in quote_page:
    try:
        page = urllib2.urlopen(link)
        soup = BeautifulSoup(page, 'html.parser')

        name_box = soup.find('span', attrs={'class': 'company-name'})
        name = name_box.text.strip()
        print name

        price_box = soup.find('span', attrs={'class':'price'})
        price = price_box.text.strip()
        print price

        recrating_box = soup.find('div', attrs={'class':'recommendation-rating'})
        recrat = recrating_box.text.strip()
        print recrat

        opinion = soup.find('div', attrs={'class':'recommendation-marker'})['style']
        print opinion
    except (AttributeError, TypeError):
        # find() returns None when an element is missing; skip this URL
        continue

    results.append((name, price, recrat, opinion))   

# open the csv file for writing; mode 'w' overwrites old data (use 'a' to append instead)
with open('index.csv', 'w') as csv_file:
    writer = csv.writer(csv_file)
    for item in results:
        writer.writerow([item[0], item[1], item[2], item[3], datetime.now()])

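There were 3 problems. First, you were modifying a list that was still in use (quote_page holds your URLs), which is not a good idea: I store the scraped rows in a separate list, results, and append to it inside the loop.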
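A minimal sketch of that first point, with the network calls left out. The `scrape` function and the example.com URLs are hypothetical stand-ins for the urlopen/BeautifulSoup steps, and the values it returns are made up:

```python
# Hypothetical stand-in for the urlopen/BeautifulSoup steps; values are made up.
def scrape(url):
    return (url.split('/')[-1], '0.00', '0.00', 'left: 0%')

urls = ['http://example.com/overview/AAA', 'http://example.com/overview/BBB']
results = []                      # scraped rows go here, not into urls

for url in urls:
    results.append(scrape(url))   # inside the loop: one row per URL

print(results)
```

Because the append sits inside the loop, `results` ends up with one tuple per URL, and `urls` still contains only URLs.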

Second, you were iterating over the list but unpacking only 2 of the 4 items in each entry. I access them by index instead.
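The unpacking error can be reproduced without any scraping. This is only an illustration: the row below is a made-up placeholder, not real quote data. Unpacking all four names is an alternative to indexing:

```python
# One made-up row with four fields: name, price, recrat, opinion
results = [('Example NV', '10.00', '2.00', 'left: 25%')]

# Two names cannot receive four values
error = None
try:
    for name, price in results:
        pass
except ValueError as exc:
    error = str(exc)

# Unpacking all four fields works
rows = []
for name, price, recrat, opinion in results:
    rows.append([name, price, recrat, opinion])

print(error)
print(rows)
```

In the question's code the same error most likely fires even earlier, because the first items in quote_page are URL strings, and a long string cannot be unpacked into two names either.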

Finally, because you are iterating over the rows yourself, you need to write one row at a time, so writerows has to be changed to writerow.
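To see the difference between the two calls, here is a small sketch that writes to an in-memory buffer instead of index.csv. It is written for Python 3 (`io.StringIO`); under Python 2.7 `StringIO.StringIO` would be the counterpart. The field values are placeholders:

```python
import csv
import io

row = ['name', 'price']

# writerow: the list is one row, each element is one field
one = io.StringIO()
csv.writer(one).writerow(row)

# writerows: the list is treated as many rows, so each string
# becomes its own row and is split into single characters
flat = io.StringIO()
csv.writer(flat).writerows(row)

print(repr(one.getvalue()))
print(repr(flat.getvalue()))
```

So writerows expects a list of rows, such as the whole `results` list, while writerow expects a single row.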