Question

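I am trying to parse some text from a web page using the bs4 module. The page is downloaded with the requests module, but when I run bs4 against the content, an exception is raised and 'error downloading page' is printed. The status_code of the download is 200. I'm not sure what I am missing here. I have installed the lxml module to avoid the warning that bs4 emits.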

try:
    import requests, bs4
    import os
    import ntpath as path
    import lxml
    res = requests.get('http://nostarch.com/',stream=True)
    print(res.status_code, len(res.text))
    if res.status_code == 200:
        bs = bs4.BeautifulSoup(res.text, 'lxml')
        out = bs.select('p')
        print('the no of elements matched :', len(out))
        for i in range(len(out)):
            print(i,'th Text is :-', out[i].getText())

except Exception as err:
    print('error downloading page', res.raise_for_status())

Output:

200 51431

error downloading page None

An exception is raised while parsing the output with the bs4 module.
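A note on why the output above hides the real problem: the except handler in the code catches every exception but never prints `err`, and `res.raise_for_status()` returns None for a 200 response, so the message is always "error downloading page None" no matter what actually failed. The following is a minimal sketch, not a confirmed fix, that separates the download step from the parse/print step and shows the actual exception. The helper name `describe_error`, the `timeout=10` argument, and the dropped `stream=True` are my own choices for illustration. One possible cause (an assumption, not something the output confirms) is a UnicodeEncodeError raised by `print` on a console whose encoding cannot represent some characters in the page; the traceback would show whether that is the case.

```python
import traceback


def describe_error(err):
    """Format an exception as 'TypeName: message' so the real cause is visible."""
    return f"{type(err).__name__}: {err}"


def main():
    # Third-party imports sit outside any try/except, so a missing
    # package fails loudly instead of being reported as a download error.
    import requests
    import bs4

    # Step 1: download. Only network/HTTP problems are handled here.
    try:
        res = requests.get('http://nostarch.com/', timeout=10)
        # Raises requests.HTTPError on 4xx/5xx; returns None otherwise.
        res.raise_for_status()
    except requests.RequestException as err:
        print('error downloading page:', describe_error(err))
        return

    # Step 2: parse and print. A failure here is not a download error.
    try:
        soup = bs4.BeautifulSoup(res.text, 'lxml')
        paragraphs = soup.select('p')
        print('number of elements matched:', len(paragraphs))
        for i, p in enumerate(paragraphs):
            print(i, 'th text is:', p.get_text())
    except Exception as err:
        print('error while parsing or printing:', describe_error(err))
        traceback.print_exc()  # the full traceback shows the failing line
```

Calling `main()` runs both steps. If the second handler fires, the printed exception type and traceback should point at the statement that actually failed.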

0 answers: