我想获取此页面上的信息: http://www.jnfdc.gov.cn/onsaling/viewhouse.shtml?fmid=757e06e0-c5b3-4384-9a14-2cb1eac011d1
从浏览器调试器工具中我获取此文件中的信息: http://www.jnfdc.gov.cn/r/house/757e06e0-c5b3-4384-9a14-2cb1eac011d1_154810896.xml
但是当我使用浏览器直接访问网址时,我无法获取该文件。
我不知道为什么。
我使用python。
import urllib2
#url1 = 'http://www.jnfdc.gov.cn/onsaling/viewhouse.shtml?fmid=757e06e0-c5b3-4384-9a14-2cb1eac011d1'
url = 'http://www.jnfdc.gov.cn/r/house/757e06e0-c5b3-4384-9a14-2cb1eac011d1_113649432.xml'
headers = {
"Accept" :"*/*",
"Accept-Encoding" :"gzip, deflate, sdch",
"Accept-Language" :"zh-CN,zh;q=0.8",
"Cache-Control" :"max-age=0",
"Connection" :"keep-alive",
"Cookie" :"JSESSIONID=A205D8D7B0807FD34F879D6CB6EEB0CE",
"DNT" :"1",
"Host" :"www.jnfdc.gov.cn",
"Referer" :"http://www.jnfdc.gov.cn/onsaling/viewhouse.shtml?fmid=757e06e0-c5b3-4384-9a14-2cb1eac011d1",
"User-Agent" :"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.3051.400 QQBrowser/9.6.11301.400"
}
req = urllib2.Request(url, headers=headers)
resp = urllib2.urlopen(req) #this code throw exception:HTTPError: Not Found
我该怎么办?感谢。