Question

The URL is: https://boathistoryreport.com/directory/manufacturers/

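With wget I can fetch the complete source that the browser displays. With curl I get a different response, one that specifies a redirect. However, the redirect must point to the same URL, because the address in the browser bar does not change, and the link given in the curl response is identical to the URL I originally requested.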

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
<p>You should be redirected automatically to target URL: <a href="https://boathistoryreport.com/directory/manufacturers/">https://boathistoryreport.com/directory/manufacturers/</a>
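To see what the server actually sends back, it can help to look at the raw status code and headers without following the redirect. The following is a minimal diagnostic sketch, not a confirmed explanation of this site's behavior: the helper name `inspect_redirect` and the default `user_agent` value are made up for illustration. A `Set-Cookie` header on the redirect response would hint that the server expects the client to come back to the same URL with a cookie, which is one possible way a "self-redirect" avoids looping.

```python
import http.client
from urllib.parse import urlsplit


def inspect_redirect(url, user_agent="Mozilla/5.0"):
    """Send one GET request and return (status, Location, Set-Cookie list).

    http.client never follows redirects, so the values returned here are
    exactly what the server answered for this single request.
    The function name and default User-Agent are illustrative assumptions.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        cookies = resp.headers.get_all("Set-Cookie") or []
        return resp.status, resp.getheader("Location"), cookies
    finally:
        conn.close()
```

Calling `inspect_redirect("https://boathistoryreport.com/directory/manufacturers/")` and printing the result would show the status code, the redirect target, and any cookies the server tries to set.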

In addition, when I try to fetch the page source with urllib in Python 3, I get a 308 redirect error and no page data is available. This is the Python code:

import ssl
import urllib.error
import urllib.request


def get_page(url):  # function name assumed; the original snippet omitted the def line
    req = urllib.request.Request(
        url,
        data=None,
        headers={
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
        }
    )
    # Private helper that disables certificate verification
    context = ssl._create_unverified_context()
    try:
        page_data = urllib.request.urlopen(req, context=context).read()
        return page_data
    except urllib.error.HTTPError as e:
        # Return code error (e.g. 404, 501, ...); the 308 is reported here
        print('HTTPError: {}'.format(e.code))
        return False
    except urllib.error.URLError as e:
        # Not an HTTP-specific error (e.g. connection refused)
        print('URLError: {}'.format(e.reason))
        return False

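My questions are: 1) How can a page redirect to itself without causing a loop? Or, what is actually going on here? 2) How can I fetch this resource with Python 3? I believe I have to use SSL, since the page is not available without it. I would prefer to use urllib to fetch the page.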
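On the urllib side, `HTTPRedirectHandler` in Python versions before 3.11 does not follow status 308, so the response surfaces as an `HTTPError`. A possible workaround, sketched under that assumption, is a small handler subclass that treats 308 like 307. The names `Redirect308Handler` and `build_308_opener` are hypothetical, and this only makes urllib follow the redirect; whether the site then serves the page (for example, if it also expects cookies) is not something this sketch can guarantee.

```python
import ssl
import urllib.request


class Redirect308Handler(urllib.request.HTTPRedirectHandler):
    """Follow 308 Permanent Redirect the same way 307 is followed.

    Hypothetical helper: older urllib versions reject 308 inside
    redirect_request, so the code is mapped to 307 before delegating.
    """

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if code == 308:
            code = 307  # same semantics: keep the request method
        return super().redirect_request(req, fp, code, msg, headers, newurl)

    # Route 308 responses through the standard redirect machinery.
    http_error_308 = urllib.request.HTTPRedirectHandler.http_error_302


def build_308_opener():
    """Return an opener with 308 support and normal certificate checks."""
    context = ssl.create_default_context()
    return urllib.request.build_opener(
        Redirect308Handler(),
        urllib.request.HTTPSHandler(context=context),
        # Cookies are kept across the redirect, in case the server sets one.
        urllib.request.HTTPCookieProcessor(),
    )
```

With this in place, `build_308_opener().open(req).read()` could replace the `urllib.request.urlopen(req, context=context).read()` call, where `req` is the same `Request` object as before. urllib still stops with an `HTTPError` if the same URL keeps redirecting, so a true loop would remain visible.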

Python urllib 308 redirect

0 answers: