我正在使用BeautifulSoup4
制作网页抓取工具并请求库。我在BeautifulSoup
工作时遇到了一些麻烦,但得到了一些帮助,并且能够解决这个问题。现在我遇到了一个新问题,我不知道如何解决它。我正在使用请求2.2.1,我正在尝试在Python 3.1.2上运行该程序。当我这样做时,我得到一个追溯错误。
这是我的代码:
from bs4 import BeautifulSoup
import requests
url = input("Enter a URL (start with www): ")
link = "http://" + url
page = requests.get(link).content
soup = BeautifulSoup(page)
for url in soup.find_all('a'):
print(url.get('href'))
print()
和错误:
Enter a URL (start with www): www.google.com
Traceback (most recent call last):
File "/Users/user/Desktop/project.py", line 8, in <module>
page = requests.get(link).content
File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/site-packages/requests-2.2.1-py3.1.egg/requests/api.py", line 55, in get
return request('get', url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/site-packages/requests-2.2.1-py3.1.egg/requests/api.py", line 44, in request
return session.request(method=method, url=url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/site-packages/requests-2.2.1-py3.1.egg/requests/sessions.py", line 349, in request
prep = self.prepare_request(req)
File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/site-packages/requests-2.2.1-py3.1.egg/requests/sessions.py", line 287, in prepare_request
hooks=merge_hooks(request.hooks, self.hooks),
File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/site-packages/requests-2.2.1-py3.1.egg/requests/models.py", line 287, in prepare
self.prepare_url(url,params)
File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/site-packages/requests-2.2.1-py3.1.egg/requests/models.py", line 321, in prepare_url
url = str(url)
TypeError: 'tuple' object is not callable
我已经做了一些寻找,当其他人得到这个错误时(主要是在django中)有一个逗号丢失,但我不确定在哪里放一个逗号?任何帮助将不胜感激。