Moving scraped data to CSV using BeautifulSoup

Time: 2014-04-06 14:00:45

Tags: python html csv html-parsing beautifulsoup

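Moving the data you have scraped with BeautifulSoup into a CSV file seems like an essential step. I am close to succeeding, but somehow each column in the CSV file holds a single letter of the scraped text, and only the last scraped item gets written.

Here is my code: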

import urllib2
import csv
from bs4 import BeautifulSoup
url = "http://www.chicagoreader.com/chicago/BestOf?category=4053660&year=2013"
page = urllib2.urlopen(url)
soup_package = BeautifulSoup(page)
page.close()

#find everything in the div class="bestOfItem). This works.
all_categories = soup_package.findAll("div",class_="bestOfItem")
print(winner_category) #print out all winner categories to see if working

#grab just the text in a tag:
for match_categories in all_categories:
    winner_category = match_categories.a.string

#Move to csv file:
f = file("file.csv", 'a')
csv_writer = csv.writer(f)
csv_writer.writerow(winner_category)
print("Check your dropbox for file")

2 Answers:

Answer 0 (score: 0)

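Move the "#Move to csv file:" part inside the for loop.

Also, it looks like you are overwriting winner_category on every pass of the for loop. Using a different variable might be a better idea.

Something like this (untested) should help: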

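#grab just the text in a tag:
f = open("file.csv", 'a')
csv_writer = csv.writer(f)  # create the writer once, outside the loop

for match_categories in all_categories:
    fwinner = match_categories.a.string

    #Move to csv file:
    if fwinner:  # .string is None when the tag has no single text child
        # writerow() expects a sequence, so wrap the value in a list;
        # encode because the Python 2 csv module cannot write non-ASCII unicode
        csv_writer.writerow([fwinner.encode('utf-8')])

f.close()
print("Check your dropbox for file")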
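
The overwriting problem mentioned in this answer can be reproduced without any network access. The following is only a minimal sketch, assuming Python 3; the list `titles` is hypothetical stand-in data, not output from the real page.

```python
# Hypothetical stand-in for the scraped strings (not real page data).
titles = ["Best Dance Venue", "Best Mime", "Best Fat Suit"]

# Pattern from the question: the variable is reassigned on every pass,
# so after the loop it only holds the final item.
for title in titles:
    winner_category = title
last_only = [winner_category]

# Doing the work inside the loop (here: collecting) keeps every item.
collected = []
for title in titles:
    collected.append(title)

print(last_only)
print(collected)
```

Anything that should happen once per item, such as writing a row, has to sit inside the loop body for the same reason.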

Answer 1 (score: 0)

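The problem is that writerow() expects an iterable. In your case it receives a string and splits it into individual characters. Put each value into a list.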
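
The difference is easy to see in isolation. This is a small sketch, assuming Python 3; it writes to an in-memory `io.StringIO` buffer instead of a real file, and the `lineterminator` setting is only there to keep the captured output simple.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, lineterminator="\n")

# A bare string is iterated character by character: one column per letter.
writer.writerow("Best")
# Wrapping the string in a list gives a single column holding the whole value.
writer.writerow(["Best"])

rows = buf.getvalue().splitlines()
print(rows)
```

The first row comes out as `B,e,s,t`, the second as `Best`.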

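Also, you need to do this inside the loop.

Also, you can pass urllib2.urlopen(url) directly to the BeautifulSoup constructor.

Also, you should use the with context manager when working with files.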
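
As a rough illustration of the context-manager advice, here is a sketch assuming Python 3. The temporary path and the row values are hypothetical and chosen only so the example does not overwrite a real file.

```python
import csv
import os
import tempfile

# Hypothetical scratch location so the sketch does not clobber a real file.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# newline="" is what the Python 3 csv documentation recommends for csv files.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["first row"])
    writer.writerow(["second row"])

# The file is closed as soon as the with block exits,
# even if an exception had been raised inside it.
closed_after_block = f.closed

# Read the file back to confirm both rows were flushed to disk.
with open(path, newline="") as f:
    rows_read = list(csv.reader(f))

print(closed_after_block, rows_read)
```

With this pattern there is no separate f.close() call to forget.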

Here is the modified code:

import urllib2
import csv
from bs4 import BeautifulSoup


url = "http://www.chicagoreader.com/chicago/BestOf?category=4053660&year=2013"
soup_package = BeautifulSoup(urllib2.urlopen(url))
all_categories = soup_package.find_all("div", class_="bestOfItem")

with open("file.csv", 'w') as f:
    csv_writer = csv.writer(f)
    for match_categories in all_categories:
        value = match_categories.a.string
        if value:
            csv_writer.writerow([value.encode('utf-8')])

The contents of file.csv after running the script are:

Best View From a Performance Space
Best Amateur Hip-Hop Dancer Who's Also a Professional Wrestler
Best Dance Venue in New Digs
Best Outré Dance
Best (and Most Vocal) Mime
Best Performance in a Fat Suit
Best Theatrical Use of Unruly Facial Hair
...

Also, I'm not sure you need the csv module here.
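
For a single column of text, plain line-by-line writing might look like the sketch below. It assumes Python 3, reuses two titles from the listing above as sample data, and uses `io.StringIO` as a stand-in for a file opened with `open("file.csv", "w", encoding="utf-8")`.

```python
import io

# Sample values taken from the output listing; any list of strings would do.
titles = ["Best Outré Dance", "Best (and Most Vocal) Mime"]

# In-memory stand-in for a real text file opened for writing.
out = io.StringIO()
for title in titles:
    # One value per line, no quoting or escaping applied.
    out.write(title + "\n")

text = out.getvalue()
print(text)
```

One caveat: the csv module quotes values that contain commas, quotes, or newlines, which a plain write does not, so this shortcut is only safe when the values cannot contain those characters or the file is not meant to be parsed as CSV later.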