How do I download an XML file over HTTP?

Time: 2018-07-25 17:11:22

Tags: xml python-2.7 http url web

I tried two ways of downloading an XML file:

import requests
from tqdm import tqdm

url = "http://software.broadinstitute.org/gsea/msigdb/download_file.jsp?filePath=/resources/msigdb/6.2/msigdb_v6.2.xml"
response = requests.get(url, stream=True)

with open("lol.xml", "wb") as handle:
    for data in tqdm(response.iter_content(chunk_size=8192)):  # the default chunk_size is 1 byte
        handle.write(data)

And the second one:

import urllib2
response = urllib2.urlopen(url)
data = response.read()
print(data)

The URL redirects to:

response.url
u'https://software.broadinstitute.org/gsea/login.jsp;jsessionid=2544FF431CB094FBBA80451EDD3A0411'

It turns out that I only download an HTML file, not the XML file. Here is a snippet of the output:

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <base href="http://software.broadinstitute.org/gsea/" />
    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
    <meta name="verify-v1" content="/23Jlayki9tnRqU7DcCYrbFI7zPmHJ3HfeZltM6mK5Q=" />
    <title>GSEA | Login</title>
    <link href="css/style.css" rel="stylesheet" type="text/css" />
</head>
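
A quick way to confirm this kind of result is to inspect where the request ended up and what came back before writing anything to disk. The sketch below is only illustrative: `landed_on_login` and `looks_like_xml` are hypothetical helper names, and the `login.jsp` marker is taken from the redirect URL shown above, not from any documented behaviour of the site.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7


def landed_on_login(final_url, login_marker="login.jsp"):
    """Return True if the final URL (after redirects) points at the login page.

    `login_marker` is an assumption based on the redirect seen in this question.
    """
    # Drop any ";jsessionid=..." path parameter before comparing.
    path = urlparse(final_url).path.split(";", 1)[0]
    return path.endswith(login_marker)


def looks_like_xml(content_type, first_bytes):
    """Heuristic check that a response is XML rather than an HTML page."""
    ctype = (content_type or "").lower()
    if "html" in ctype:
        return False
    if "xml" in ctype:
        return True
    # No usable header: fall back to looking for an XML declaration.
    return first_bytes.lstrip().startswith(b"<?xml")
```

With the first approach, these could be called as `landed_on_login(response.url)` and `looks_like_xml(response.headers.get("Content-Type"), first_chunk)`, where `first_chunk` stands for the first block of bytes read from the response.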

How can I download the XML file?

1 answer:

Answer 0 (score: 0)

Try

handle.write(response.content)

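in place of the last two lines of your first approach. The likely problem is that the file cannot be downloaded directly from this link, because it requires a login: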

url = "http://software.broadinstitute.org/gsea/msigdb/download_file.jsp?filePath=/resources/msigdb/6.2/msigdb_v6.2.xml"
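
If a login really is required, one possible approach is to log in with a `requests.Session` first, so that the session cookie is sent along with the download request. The following is a rough sketch under several assumptions: `download_after_login` is a hypothetical helper, and the login URL and form field names are placeholders that would have to be read from the site's actual login form. Nothing here is confirmed to match how this particular site authenticates.

```python
def download_after_login(session, login_url, form_fields, file_url, dest_path,
                         chunk_size=8192):
    """Log in with `session`, then stream `file_url` to `dest_path`.

    `session` is expected to behave like requests.Session. `login_url` and
    `form_fields` are placeholders supplied by the caller.
    """
    # Submit the login form; the session keeps any cookies the server sets.
    login_response = session.post(login_url, data=form_fields)
    login_response.raise_for_status()

    response = session.get(file_url, stream=True)
    response.raise_for_status()

    # Heuristic (an assumption): a final URL containing "login" means the
    # server bounced the request back to the login page.
    if "login" in response.url:
        raise RuntimeError("still redirected to login page: %s" % response.url)

    with open(dest_path, "wb") as handle:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                handle.write(chunk)
    return dest_path


# Hypothetical usage (needs `import requests`; all form values are placeholders):
# session = requests.Session()
# download_after_login(session, "<login form action URL>",
#                      {"<field name>": "<value>"}, url, "msigdb_v6.2.xml")
```

Passing the session in as a parameter keeps the helper easy to try out with a stand-in object before pointing it at the real site.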