Question

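I just downloaded Python 3.6 (yesterday) and am looking forward to learning the language. For my first project, I want to build a web scraper. I looked at a few examples online and picked one. This is the code I am using: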

import csv
import requests
from BeautifulSoup import BeautifulSoup

url = 'http://www.showmeboone.com/sheriff/JailResidents/JailResidents.asp'
response = requests.get(url)
html = response.content

soup = BeautifulSoup(html)
table = soup.find('tbody', attrs={'class': 'stripe'})

list_of_rows = []
for row in table.findAll('tr')[1:]:
    list_of_cells = []
    for cell in row.findAll('td'):
        text = cell.text.replace('&nbsp;', '')
        list_of_cells.append(text)
    list_of_rows.append(list_of_cells)

outfile = open("./inmates.csv", "wb")
writer = csv.writer(outfile)
writer.writerow(["Last", "First", "Middle", "Gender", "Race", "Age", "City", "State"])
writer.writerows(list_of_rows)

However, when I try to run it, I get the following error:

ModuleNotFoundError: No module named 'BeautifulSoup'

I went ahead and installed BeautifulSoup, but it still does not work. Any suggestions are appreciated.

Answer 1

You need

import bs4

not

import BeautifulSoup

The class is then available as bs4.BeautifulSoup.
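If you would rather keep calling BeautifulSoup(...) the way the question's code does, the class can be imported directly from bs4. This is only a minimal sketch, assuming the package was installed with pip install beautifulsoup4; the HTML string is made up for illustration and is not the real page:

```python
from bs4 import BeautifulSoup  # the Python 3 module is bs4, not BeautifulSoup

# Made-up HTML, only for illustration (an assumption, not the real page).
html = "<table><tbody class='stripe'><tr><td>Doe</td><td>John</td></tr></tbody></table>"

# Name the parser explicitly; "html.parser" ships with Python itself.
soup = BeautifulSoup(html, "html.parser")
table = soup.find("tbody", attrs={"class": "stripe"})

# find_all is the bs4 spelling of the older findAll.
cells = [cell.get_text() for cell in table.find_all("td")]
print(cells)
```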

Also, you might consider the following format:

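import urllib.request
import bs4

link = "some url"  # placeholder: replace with the page to scrape

sauce = urllib.request.urlopen(link).read()  # download the raw page bytes
text = bs4.BeautifulSoup(sauce, "lxml").get_text()  # get the text from the page ("lxml" needs the lxml package)
lines = text.split("\n")  # split the text into lines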
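For the script in the question, a Python 3 port might look roughly like the sketch below. It assumes beautifulsoup4 is installed and that the page still has a tbody with class "stripe", as the question's code expects. The function names extract_rows and write_rows and the sample HTML are made up for illustration. Two things differ from the original besides the import. First, bs4 decodes &nbsp; into the non-breaking space character "\xa0", so that is what gets stripped. Second, in Python 3 the csv module wants a file opened in text mode with newline="" rather than "wb".

```python
import csv

from bs4 import BeautifulSoup


def extract_rows(html):
    """Return the cell text of every row after the first in the 'stripe' tbody."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("tbody", attrs={"class": "stripe"})
    if table is None:  # the page layout may differ from what the question assumes
        return []
    rows = []
    for row in table.find_all("tr")[1:]:  # skip the first row, as the question does
        # bs4 turns &nbsp; into "\xa0", so strip that character instead
        rows.append([cell.get_text().replace("\xa0", "") for cell in row.find_all("td")])
    return rows


def write_rows(rows, path):
    """Write a header line plus the scraped rows to a CSV file."""
    # Python 3: open in text mode with newline="" instead of "wb"
    with open(path, "w", newline="", encoding="utf-8") as outfile:
        writer = csv.writer(outfile)
        writer.writerow(["Last", "First", "Middle", "Gender", "Race", "Age", "City", "State"])
        writer.writerows(rows)


# Made-up sample standing in for requests.get(url).content
sample = """<table><tbody class="stripe">
<tr><td>header</td></tr>
<tr><td>Doe</td><td>John&nbsp;</td></tr>
</tbody></table>"""
rows = extract_rows(sample)
print(rows)
```

With the real page, html would come from requests.get(url).content as in the question, followed by write_rows(extract_rows(html), "./inmates.csv").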
