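Python beginner trying to build a web scraper

Time: 2017-09-07 17:36:37

Tags: python csv

I just downloaded Python 3.6 (yesterday) and am looking forward to learning the language. For my first project, I want to build a web scraper. I looked at a few examples online and picked one. This is the code I am using: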

import csv
import requests
from BeautifulSoup import BeautifulSoup

url = 'http://www.showmeboone.com/sheriff/JailResidents/JailResidents.asp'
response = requests.get(url)
html = response.content

soup = BeautifulSoup(html)
table = soup.find('tbody', attrs={'class': 'stripe'})

list_of_rows = []
for row in table.findAll('tr')[1:]:
    list_of_cells = []
    for cell in row.findAll('td'):
        text = cell.text.replace(' ', '')
        list_of_cells.append(text)
    list_of_rows.append(list_of_cells)

outfile = open("./inmates.csv", "wb")
writer = csv.writer(outfile)
writer.writerow(["Last", "First", "Middle", "Gender", "Race", "Age", "City", "State"])
writer.writerows(list_of_rows)

However, when I try to run it, I get the following error:

ModuleNotFoundError: No module named 'BeautifulSoup'

I went ahead and installed BeautifulSoup, but it still doesn't work. Any suggestions are appreciated.

1 answer:

Answer 0 (score: 0)

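You need

import bs4

not

import BeautifulSoup

The current library is installed as the beautifulsoup4 package (pip install beautifulsoup4) and imported under the name bs4. The old BeautifulSoup module name belongs to version 3, which only supports Python 2. In your script, the import line becomes from bs4 import BeautifulSoup.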
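As a rough sketch of how the parsing loop from the question might look with the corrected import: the HTML below is a made-up sample standing in for the real page (I have not checked the actual markup of that site), extract_rows is a hypothetical helper name, and "html.parser" is the parser bundled with Python, chosen here only so that nothing beyond bs4 needs installing.

```python
from bs4 import BeautifulSoup  # provided by: pip install beautifulsoup4

# Hypothetical sample markup; the real page may be structured differently.
SAMPLE_HTML = """
<table>
  <tbody class="stripe">
    <tr><th>Last</th><th>First</th></tr>
    <tr><td>Doe</td><td> John </td></tr>
    <tr><td>Roe</td><td>Jane</td></tr>
  </tbody>
</table>
"""


def extract_rows(html):
    """Return the cell text of every row after the header row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("tbody", attrs={"class": "stripe"})
    rows = []
    for row in table.find_all("tr")[1:]:  # skip the header row
        # get_text(strip=True) drops surrounding whitespace in each cell
        cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
        rows.append(cells)
    return rows


rows = extract_rows(SAMPLE_HTML)
print(rows)
```

find_all is the bs4 spelling of findAll; the old name still works, but the newer one is what the current documentation uses. Passing the parser name explicitly also avoids the warning bs4 emits when it has to guess one.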

Alternatively, you could consider the following approach:

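import urllib.request
import bs4

link = "some url"  # placeholder: replace with the page to scrape

sauce = urllib.request.urlopen(link).read()  # download the raw HTML
soup = bs4.BeautifulSoup(sauce, "lxml").get_text()  # extract the text from the page (needs the lxml package)
lines = soup.split("\n")  # split the text into lines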
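One more thing you will probably run into once the import is fixed: the script in the question opens the output file with "wb", which is the Python 2 idiom. In Python 3, csv.writer writes str, so the file should be opened in text mode with newline="". A minimal sketch, assuming made-up row data and a temporary output path (both are placeholders, not values from the real site):

```python
import csv
import os
import tempfile

# Hypothetical row standing in for the scraped data
list_of_rows = [["Doe", "John", "", "M", "W", "30", "Columbia", "MO"]]
header = ["Last", "First", "Middle", "Gender", "Race", "Age", "City", "State"]

# Hypothetical output location; the question writes to ./inmates.csv
out_path = os.path.join(tempfile.gettempdir(), "inmates.csv")

# Text mode with newline="" lets the csv module control line endings itself
with open(out_path, "w", newline="", encoding="utf-8") as outfile:
    writer = csv.writer(outfile)
    writer.writerow(header)
    writer.writerows(list_of_rows)

# Read the file back to confirm what was written
with open(out_path, newline="", encoding="utf-8") as infile:
    written = list(csv.reader(infile))

print(written)
```

The with block also closes the file for you, which the original script never does.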