Question

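I'm trying to write a script that automates file retrieval in Python using urllib2. I authenticate via NTLM and am trying to download encrypted/compressed files from a page that generates download links as variable Unicode strings. Parts of the script are redacted because this is for corporate use, but the text should otherwise be accurate. My script so far: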

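import os
import re
import urllib2
from urllib2 import HTTPError, URLError


def dlFiles(dlLink):
    """Download each item in dlLink and return the list of saved file names."""
    print "-----------------------"
    fileName = []

    for item in dlLink:
        try:
            url = "constructed download URL"  # redacted; built from item
            f = urllib2.urlopen(url)
            # Pull the name out of the filename*=utf-8''... parameter.
            m = re.search(r"filename\*=utf-8''(.*)", str(f.info()['Content-Disposition']))
            # Use a local name rather than indexing fileName by a counter,
            # which drifts out of step once any download raises.
            name = m.group(1)
            fileName.append(name)
            print "Downloading " + name
            #print f.info()

            with open(os.path.basename(name), "wb") as local_file:
                local_file.write(f.read())

        except HTTPError, e:
            print "HTTP Error:", e.code, url

        except URLError, e:
            print "URL Error:", e.reason, url

    return fileName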

The returned headers are:

Date: Thu, 17 Jul 2014 19:09:45 GMT
Server: Apache
Content-Disposition: attachment; filename*=utf-8''redacted.zip
Content-Length: 0
Connection: close
Content-Type: application/x-download
X-Proxy-Host: redacted

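As you can see, Content-Length is listed as 0. I get the same headers (Content-Length of 0) when downloading through Chrome, which results in no file being downloaded (just a correctly named, empty file saved to disk). Thinking it might be a redirect issue, I tried the requests library but got the same result.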
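For reference, the header check described above can be sketched without touching the network. This is only a minimal illustration, assuming the headers are available as a plain dict; the helper names `filename_from_disposition` and `body_is_declared_empty` are hypothetical and not part of urllib2 or requests. The import fallback is there so the snippet runs on both Python 2 and Python 3.

```python
import re

try:
    from urllib.parse import unquote  # Python 3
except ImportError:
    from urllib import unquote  # Python 2


def filename_from_disposition(header_value):
    """Return the percent-decoded name from a filename*=utf-8''... parameter,
    or None when the parameter is missing. (Hypothetical helper.)"""
    match = re.search(r"filename\*=utf-8''([^;]+)", header_value or "",
                      re.IGNORECASE)
    if match is None:
        return None
    return unquote(match.group(1).strip())


def body_is_declared_empty(headers):
    """True when the server explicitly announced a zero-length body.
    (Hypothetical helper; expects a dict-like object.)"""
    return str(headers.get("Content-Length", "")).strip() == "0"


# The two relevant headers copied from the response shown in the question.
headers = {
    "Content-Disposition": "attachment; filename*=utf-8''redacted.zip",
    "Content-Length": "0",
}
name = filename_from_disposition(headers["Content-Disposition"])
empty = body_is_declared_empty(headers)
```

Checking `body_is_declared_empty` before opening the local file would at least avoid writing a zero-byte file when the server sends nothing, although it does not explain why the server sends nothing.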

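Could it be related to the proxy host? Would it be more practical to handle whatever redirect may be happening through Selenium? All tips are welcome. Thanks!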
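One way to probe the proxy question might be to repeat the request with client-side proxies switched off and compare the headers. The sketch below is an assumption-laden diagnostic, not a fix: `make_direct_opener` is a hypothetical name, the NTLM handler used for authentication would still have to be passed to `build_opener` as well, and if `X-Proxy-Host` is added by a server-side reverse proxy this will change nothing. The import fallback lets it run on both Python 2 (urllib2) and Python 3.

```python
try:
    from urllib.request import ProxyHandler, build_opener  # Python 3
except ImportError:
    from urllib2 import ProxyHandler, build_opener  # Python 2


def make_direct_opener():
    """Build an opener that ignores proxy settings from the environment.
    (Hypothetical helper.)"""
    # An empty mapping tells ProxyHandler to use no proxies at all,
    # replacing the default handler that reads the environment.
    return build_opener(ProxyHandler({}))


opener = make_direct_opener()
# response = opener.open(url)   # url: the redacted download URL
# print(response.info())        # compare Content-Length with the proxied run
```

If the direct request is blocked by the network, or returns the same `Content-Length: 0`, a client-side proxy is probably not the cause.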

Python file download returns zero size

0 answers: