Question

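While working with Python's requests library and supplying the URL through a function, I keep getting this error:

Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.

I have imported the function from another file into the file that runs my code. It returns the URL string, and I import it so I can use it in requests.get(). I have already tried fiddling with the function and made sure it is actually being imported (no typos, etc.).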

#page_scrapper.py
import requests
from bs4 import BeautifulSoup  # BeautifulSoup is called by its bare name below
from classes import image_logic # This is where I import the function from my other file.

result = requests.get(image_logic()) # For some reason this is what's causing all the issues, it won't work with a function, only with a url, period.
c = result.content
soup = BeautifulSoup(c, 'html.parser')

#classes.py
import requests
import bs4

def image_logic():
    return "URL string here, obviously this won't be the actual string I have sitting here"

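I expected result = requests.get(image_logic()) to call the function and use its return value as the URL string, but it keeps raising the same error. Printing the URL to the console from page_scrapper.py works fine.

Any other tips would be greatly appreciated.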

Answer 1

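Some characters could not be decoded, and were replaced with REPLACEMENT CHARACTER.

This is most likely because result.content gives you the web page as raw bytes, which then have to be decoded with the right encoding. If you don't want to handle the decoding yourself, try using result.text instead.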
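To make the warning a bit more concrete, here is a minimal standard-library sketch of what "replaced with REPLACEMENT CHARACTER" means. It is only an illustration and not what requests or BeautifulSoup do internally; the helper name decode_lenient and the sample bytes are made up for this example.

```python
def decode_lenient(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes, substituting U+FFFD for anything that cannot be decoded."""
    return raw.decode(encoding, errors="replace")


# "café" encoded as UTF-8 decodes cleanly.
good_bytes = "café".encode("utf-8")
print(decode_lenient(good_bytes))

# The same word encoded as Latin-1 is not valid UTF-8, so the last byte
# is swapped for the replacement character U+FFFD.
bad_bytes = "café".encode("latin-1")
print("\ufffd" in decode_lenient(bad_bytes))
```

With requests, result.content corresponds to the raw bytes here, while result.text is the string requests produces after choosing an encoding for you.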

Answer 2

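I found the answer. Apparently I was trying to scrape a plain .jpg file on the site, which doesn't contain any of the HTML that the link I actually wanted serves. Once I had image_logic() return the correct page, the problem was solved.

The whole thing came down to mixing up my links. The two looked so similar that I kept missing the simplest solution.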
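A mix-up like this could be caught before the response ever reaches the parser. The following is a rough sketch under my own assumptions, not code from the original scripts: the function names are hypothetical, and the check only looks at the Content-Type value and the first bytes of the body.

```python
JPEG_MAGIC = b"\xff\xd8\xff"  # JPEG files begin with these three bytes


def is_html_content_type(content_type: str) -> bool:
    """True if a Content-Type header value names an HTML media type."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in ("text/html", "application/xhtml+xml")


def looks_like_jpeg(body: bytes) -> bool:
    """True if the body starts with the JPEG signature."""
    return body.startswith(JPEG_MAGIC)


def should_parse_as_html(content_type: str, body: bytes) -> bool:
    """Only hand the body to an HTML parser if it plausibly is HTML."""
    return is_html_content_type(content_type) and not looks_like_jpeg(body)


print(should_parse_as_html("text/html; charset=utf-8", b"<html></html>"))
print(should_parse_as_html("image/jpeg", b"\xff\xd8\xff\xe0"))
```

In page_scrapper.py the two arguments would presumably be result.headers.get("Content-Type", "") and result.content. Note that requests.get(image_logic()) was never the problem: the function is called first and requests only ever sees the string it returns.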

Python: Does the requests URL not accept a function?
