Why isn't BeautifulSoup working? (Python 2.7.10)

Time: 2017-03-04 23:09:18

Tags: python beautifulsoup

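from bs4 import BeautifulSoup
import urllib.request  # Python 3 module (Python 2 uses urllib2 instead)

# Download the page; this is the call that raises the HTTPError
r = urllib.request.urlopen('http://www.aflcio.org/Legislation-and-Politics/Legislative-Alerts').read()
soup = BeautifulSoup(r)
print(type(soup))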

I get the message "urllib.error.HTTPError: HTTP Error 403: Forbidden".

I'm a complete beginner when it comes to modules, so I don't really know what I'm doing. Sorry.

1 answer:

Answer 0: (score: 0)

You may want to specify a User-Agent header:

import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent with the request
ret = requests.request(
    'GET',
    'http://www.aflcio.org/Legislation-and-Politics/Legislative-Alerts',
    headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8'}
)

# Parse the response body with the built-in HTML parser
soup = BeautifulSoup(ret.text, "html.parser")
print(type(soup))
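
If you would rather stay with urllib.request from the question, the same idea can probably be expressed with a Request object that carries the header. This is only a minimal sketch for Python 3: the helper names build_request and fetch_html and the USER_AGENT value are made up for illustration, and there is no guarantee the server accepts this particular string.

```python
import urllib.request
from urllib.error import HTTPError

URL = "http://www.aflcio.org/Legislation-and-Politics/Legislative-Alerts"
# Placeholder value (an assumption); substitute any browser-like string.
USER_AGENT = "Mozilla/5.0 (compatible; example-script/0.1)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent=USER_AGENT, timeout=10):
    """Fetch url and return the body as bytes, or None on an HTTP error."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except HTTPError as err:
        # A 403 here would mean the server still refuses the request.
        print("HTTP error:", err.code, err.reason)
        return None
```

Calling fetch_html(URL) returns the raw bytes on success, which can then be passed to BeautifulSoup(html, "html.parser") as in the snippet above.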