I tried two ways of downloading an XML file:
import requests
from tqdm import tqdm
url = "http://software.broadinstitute.org/gsea/msigdb/download_file.jsp?filePath=/resources/msigdb/6.2/msigdb_v6.2.xml"
response = requests.get(url, stream=True)
with open("lol.xml", "wb") as handle:
    for data in tqdm(response.iter_content()):
        handle.write(data)
And the second:
import urllib2
response = urllib2.urlopen(url)
data = response.read()
print(data)
The URL redirects to:
response.url
u'https://software.broadinstitute.org/gsea/login.jsp;jsessionid=2544FF431CB094FBBA80451EDD3A0411'
It turns out that I only get an HTML file rather than the XML file. Here is a snippet of the output:
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<base href="http://software.broadinstitute.org/gsea/" />
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<meta name="verify-v1" content="/23Jlayki9tnRqU7DcCYrbFI7zPmHJ3HfeZltM6mK5Q=" />
<title>GSEA | Login</title>
<link href="css/style.css" rel="stylesheet" type="text/css" />
</head>
How can I download the XML file?
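Answer 0 (score: 0)
Try
handle.write(response.content)
in place of the last two lines of your first method. The more likely problem, though, is that you cannot download the file directly from this link, because it requires a login: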
url = "http://software.broadinstitute.org/gsea/msigdb/download_file.jsp?filePath=/resources/msigdb/6.2/msigdb_v6.2.xml"
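Since the request ends up on login.jsp, it may help to detect that case explicitly instead of silently saving the login page. The following is only a minimal sketch: `is_login_redirect` and `download_xml` are hypothetical helper names, the "login.jsp" and "text/html" checks are heuristics based on the redirect shown in the question, and `session` is assumed to be a `requests.Session` that has already been authenticated against the site (how to log in is not shown here, because the login form's details are not given in the question).

```python
def is_login_redirect(response):
    """Heuristic: True if we landed on the login page instead of the XML file."""
    # Assumption: the login page URL contains "login.jsp" and is served as HTML,
    # as in the redirect shown in the question.
    content_type = response.headers.get("Content-Type", "")
    return "login.jsp" in response.url or content_type.startswith("text/html")


def download_xml(session, url, path, chunk_size=8192):
    """Stream `url` to `path` using a session that is assumed to be logged in."""
    response = session.get(url, stream=True)
    response.raise_for_status()
    if is_login_redirect(response):
        # Fail loudly rather than writing the HTML login page to disk.
        raise RuntimeError("Redirected to %s; log in with this session first"
                           % response.url)
    with open(path, "wb") as handle:
        # A larger chunk size avoids writing the file one byte at a time.
        for chunk in response.iter_content(chunk_size=chunk_size):
            handle.write(chunk)
    return path


# Possible usage, assuming `requests` is installed and the session is logged in:
#   import requests
#   session = requests.Session()
#   ... authenticate `session` against the site here ...
#   download_xml(session, url, "msigdb_v6.2.xml")
```

With an unauthenticated session this should raise `RuntimeError` for the URL above, which makes the login requirement visible instead of producing an HTML file named like an XML file.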