Question

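I am trying to write a script that tells me when a page has been modified (I cannot rely on "Last-Modified" because after a few days the server replaces the file with another, identical one). So I am trying to use urllib2.urlopen() and the .read() method to get a string containing the page source, like this: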

import urllib2

try:
    file = open(filedir, 'w+')
    web = urllib2.urlopen(url)
    file.write(web.read())
except IOError as e:  # urllib2.URLError is a subclass of IOError
    print "Some error %s" % e
file.close()

This works fine, but when I download the same page again, I get the same content with part of the header missing:

Reference file:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html lang="es" xmlns="http://www.w3.org/1999/xhtml" xml:lang="es"> <head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>...

But the new download only contains:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>...

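The rest of the code is identical, but I would like to know what is happening. Thanks in advance.

Note: this happens when I run the script from the Python console or from Visual Studio, but if I run it with "Sublime Text" it works fine.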
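
For the underlying goal (noticing when the page has changed without relying on "Last-Modified"), one possible approach is to compare a hash of the downloaded bytes instead of the saved files. The sketch below is only an illustration: the function names and the sample byte strings are hypothetical, and in the real script the bytes would come from web.read(). Note that a download with the header missing, as described above, would be reported as a change.

```python
import hashlib


def page_digest(content):
    """Return the SHA-256 hex digest of the raw page bytes."""
    return hashlib.sha256(content).hexdigest()


def has_changed(old_content, new_content):
    """Return True when the two downloads differ in at least one byte."""
    return page_digest(old_content) != page_digest(new_content)


# Hypothetical sample data standing in for two downloads of the same page.
reference = b'<!DOCTYPE html><html><head><meta charset="UTF-8"/></head></html>'
new_copy = b'<meta charset="UTF-8"/></head></html>'

print(has_changed(reference, reference))
print(has_changed(reference, new_copy))
```

Storing only the digest from the previous run, rather than the whole file, would be enough to detect a change on the next run.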

Python urllib2 does not return the whole file

0 answers: