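I'm trying to run a Google search by passing the keyword as a URL parameter. It works when the keyword is a single word, but it breaks as soon as the keyword contains a space. I understand that the URL can be encoded to handle this.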
import urllib.request
from urllib.parse import urlencode, quote_plus
from fake_useragent import UserAgent
import time
import requests
from bs4 import BeautifulSoup
keyword = "host free"
url = "https://www.google.co.il/search?q=%s" % (keyword)
print(url)
thepage = urllib.request.Request(url, headers=request_headers)
page = urllib.request.urlopen(thepage)
# Continue...
Output and traceback:
https://www.google.co.il/search?q=host free
Traceback (most recent call last):
File "C:\Users\Maor Ben Lulu\Desktop\Maor\Python\google\Google_Bot_new.py", line 42, in <module>
page = urllib.request.urlopen(thepage)
File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
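And when I wrote the keyword in Hebrew instead, I got:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 14-18: ordinal not in range(128)
Answer 0 (score: 1)
You can encode the URL yourself with urllib.parse.quote, but the requests module is very handy in this case. You can use it as follows: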
import requests
base_url = 'https://www.google.co.il/search'
res = requests.get(base_url, params={'q': 'host free'}) # query parameter and value in dict format to be passed as params kwarg
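As shown above, you can pass the query parameters as a dict through the params keyword argument, and requests encodes them into the URL for you.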
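If you would rather stay with urllib, the urllib.parse route mentioned at the start of this answer might look roughly like the sketch below. The helper name build_search_url is made up for illustration; urlencode and quote_plus are the standard-library functions already imported in the question. Both percent-encode non-ASCII text as UTF-8 and turn spaces into '+', which should avoid both the 400 Bad Request and the UnicodeEncodeError.

```python
from urllib.parse import quote_plus, urlencode

base_url = "https://www.google.co.il/search"


def build_search_url(keyword):
    """Hypothetical helper: return the search URL with the keyword safely encoded."""
    # urlencode builds "q=<encoded value>"; spaces become '+',
    # and non-ASCII characters are percent-encoded as UTF-8 bytes.
    return base_url + "?" + urlencode({"q": keyword})


# A keyword with a space no longer produces an invalid URL.
print(build_search_url("host free"))  # https://www.google.co.il/search?q=host+free

# A Hebrew keyword becomes pure ASCII after encoding.
print(build_search_url("שלום"))

# quote_plus does the same job for a single value, if you prefer
# to keep the "%s" formatting from the question.
print("https://www.google.co.il/search?q=%s" % quote_plus("host free"))
```

The resulting string can be passed to urllib.request.Request exactly as in the question. With requests, none of this is needed, because the params argument does the encoding for you.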