Python requests: encoding a URL

Date: 2018-08-29 16:49:26

Tags: python python-requests urllib

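I am trying to search Google by passing the keyword as a query parameter. Searching for a single word works, but the request fails as soon as the keyword contains a space. I know the URL can be encoded, but I am not sure how to do it.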

import urllib.request
from urllib.parse import urlencode, quote_plus
from fake_useragent import UserAgent
import time
import requests
from bs4 import BeautifulSoup

keyword = "host free"
url = "https://www.google.co.il/search?q=%s" % (keyword)
print(url)

thepage = urllib.request.Request(url, headers=request_headers)
page = urllib.request.urlopen(thepage)

# Continue...

Traceback:

https://www.google.co.il/search?q=host free
Traceback (most recent call last):
  File "C:\Users\Maor Ben Lulu\Desktop\Maor\Python\google\Google_Bot_new.py", line 42, in <module>
    page = urllib.request.urlopen(thepage)
  File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 641, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "C:\Program Files (x86)\Python37-32\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
[Finished in 0.7s with exit code 1]
[shell_cmd: python -u "C:\Users\Maor Ben Lulu\Desktop\Maor\Python\google\Google_Bot_new.py"]
[dir: C:\Users\Maor Ben Lulu\Desktop\Maor\Python\google]
[path: C:\Program Files (x86)\Python37-32\Scripts\;C:\Program Files (x86)\Python37-32\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;D:\Program Files\Git\cmd;C:\Users\Maor Ben Lulu\AppData\Local\Microsoft\WindowsApps;]

In another attempt, with a Hebrew keyword, I got:

  

UnicodeEncodeError: 'ascii' codec can't encode characters in position 14-18: ordinal not in range(128)

1 Answer:

Answer 0 (score: 1)

You can encode the URL yourself with urllib.parse.quote, but the requests module is very helpful in this situation. You can use it as follows:

import requests
base_url = 'https://www.google.co.il/search'
res = requests.get(base_url, params={'q': 'host free'})  # query parameter and value in dict format to be passed as params kwarg

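As shown above, you can pass the query parameters as a dict through the params keyword argument, and requests encodes them into the URL for you.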
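
If you would rather stay with urllib, as in the question, the following is a minimal sketch of the manual encoding route using only the standard library. The helper name build_search_url and the constant BASE_URL are hypothetical names chosen for illustration. The sketch only builds the URL and sends no request, so it says nothing about how Google responds to the encoded query.

```python
from urllib.parse import quote_plus, urlencode

# Same host as in the question; the constant name is an assumption.
BASE_URL = "https://www.google.co.il/search"


def build_search_url(keyword, base_url=BASE_URL):
    """Return base_url with keyword percent-encoded as the q parameter.

    urlencode() uses quote_plus() by default, so spaces become '+' and
    non-ASCII text is UTF-8 encoded and then percent-escaped.
    """
    return base_url + "?" + urlencode({"q": keyword})


# A keyword with a space, as in the question.
print(build_search_url("host free"))
# https://www.google.co.il/search?q=host+free

# A Hebrew keyword: the resulting URL contains only ASCII characters.
print(build_search_url("שלום"))

# quote_plus() can also be called directly on a single value.
print(quote_plus("host free"))
```

The string returned by build_search_url can be passed to urllib.request.Request in place of the unencoded url from the question. Because the result is pure ASCII, it avoids both the raw space seen in the printed URL and the UnicodeEncodeError raised for the Hebrew keyword.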