Question

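I'm just getting started with Python and I'm trying to get the number of results for a Google search...

So I found some nice code that uses the following modules: re, BeautifulSoup and urllib.request.

This code works with ordinary characters, but it fails when I use special characters (such as 'é', 'à', etc.).

I don't know where I should encode this URL. Could someone help me, please?

Here is the Python 3 code: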

    from bs4 import BeautifulSoup
    from urllib.request import Request, urlopen
    import re
    def get_result(search):
        search = "https://www.google.com/search?q={}".format(search.replace(" ", "%20"))
        req_google = Request(search)
        req_google.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB;    rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3')
        html_google = urlopen(req_google).read()
        soup = BeautifulSoup(html_google, "html.parser")
        scounttext = str(soup.find('div', id='resultStats'))
        scounttext = scounttext[41:60].replace(u'\xa0', "")
        num = re.findall(r'\d+', scounttext)
        return int(num[0])

    print(get_result("é"))

It returns this error:

    UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' in position 14: ordinal not in range(128)

The error occurs when this line is evaluated:

    html_google = urlopen(req_google).read()
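For what it's worth, the failure can apparently be reproduced without any network access. As far as I understand, the HTTP client behind `urlopen` encodes the request line as ASCII before sending it, and the 'é' in the query string cannot be represented in that codec. The sketch below assumes the request line looks like the `request_line` string shown; that string is an illustrative assumption, not something taken from the traceback.

```python
# Hypothetical request line, assumed to resemble what is sent for
# https://www.google.com/search?q=é
request_line = "GET /search?q=é HTTP/1.1"

try:
    request_line.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    # 'é' (U+00E9) is outside the ASCII range, so encoding fails here
    error = exc

print(error)
if error is not None:
    # Index of the offending character within the string
    print(error.start)
```

If that assumption holds, the index reported here lines up with the "position 14" in the error message above, which suggests the URL has to be made ASCII-safe before it reaches `urlopen`.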

UnicodeEncodeError: 'ascii' codec can't encode character '\xe9' when reading a URL
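One possible place to do the encoding is where the URL is built, before it is passed to `Request`. The following is only a minimal sketch under that assumption: `build_search_url` is a hypothetical helper name, and it uses `urllib.parse.quote_plus` from the standard library in place of the manual `replace(" ", "%20")` call.

```python
from urllib.parse import quote_plus


def build_search_url(search):
    """Return a Google search URL with the query percent-encoded (hypothetical helper)."""
    # quote_plus encodes non-ASCII characters as percent-escaped UTF-8 bytes
    # and turns spaces into '+', so the resulting URL is plain ASCII.
    return "https://www.google.com/search?q={}".format(quote_plus(search))


print(build_search_url("é"))  # https://www.google.com/search?q=%C3%A9
print(build_search_url("café à la crème"))
```

The returned string could then be handed to `Request(...)` in `get_result` instead of the hand-formatted one. Whether the rest of the scraping logic still matches Google's current markup is a separate question that this sketch does not address.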

0 answers: