How to make sure a page download has completed in Python

Time: 2019-01-27 16:49:46

Tags: python-3.x python-requests urllib

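I am downloading a large JSON page. Most of the time it downloads successfully, but sometimes only part of it arrives. How can I be sure the download has completed?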

My sample code is as follows:

~$ ipython3
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/IPython/__init__.py", line 48, in <module>
    from .core.application import Application
  File "/usr/lib/python3/dist-packages/IPython/core/application.py", line 25, in <module>
    from IPython.core import release, crashhandler
  File "/usr/lib/python3/dist-packages/IPython/core/crashhandler.py", line 28, in <module>
    from IPython.core import ultratb
  File "/usr/lib/python3/dist-packages/IPython/core/ultratb.py", line 124, in <module>
    from IPython.utils import path as util_path
  File "/usr/lib/python3/dist-packages/IPython/utils/path.py", line 18, in <module>
    from IPython.utils.process import system
  File "/usr/lib/python3/dist-packages/IPython/utils/process.py", line 19, in <module>
    from ._process_posix import system, getoutput, arg_split, check_pid
  File "/usr/lib/python3/dist-packages/IPython/utils/_process_posix.py", line 24, in <module>
    import pexpect
  File "/usr/lib/python3/dist-packages/pexpect/__init__.py", line 75, in <module>
    from .pty_spawn import spawn, spawnu
  File "/usr/lib/python3/dist-packages/pexpect/pty_spawn.py", line 14, in <module>
    from .spawnbase import SpawnBase
  File "/usr/lib/python3/dist-packages/pexpect/spawnbase.py", line 224
    def expect(self, pattern, timeout=-1, searchwindowsize=-1, async=False):
                                                                   ^
SyntaxError: invalid syntax

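Unfortunately, I cannot catch the partial download with try/except. My code then breaks because it did not receive all the data it needs. Is there any way to know for certain that the page has loaded completely? Thanks a lot.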

1 answer:

Answer 0: (score: 1)

[From the comments]

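You can check the size of the data you read, in bytes, against the "Content-Length" header returned by the server. If all of the data was retrieved, the two sizes should match.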
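
The size check described above can be sketched as follows. This is only a minimal illustration, assuming the server sends a Content-Length header and the body is not compressed in transit; the names body_is_complete and fetch_json are hypothetical and do not come from the question. When the header is missing (for example with chunked transfer encoding), no comparison is possible, so the check returns None instead of guessing.

```python
import json
import urllib.request


def body_is_complete(body, headers):
    """Compare the number of bytes read with the Content-Length header.

    Returns True or False when the header is present, and None when it is
    missing, since no comparison is possible in that case.
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {str(name).lower(): value for name, value in headers.items()}
    declared = normalised.get("content-length")
    if declared is None:
        return None
    return len(body) == int(declared)


def fetch_json(url, timeout=30):
    """Hypothetical helper: download url and parse it only if it is complete."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read()  # raw bytes as received
        headers = response.headers
    if body_is_complete(body, headers) is False:
        raise IOError("partial download: received %d bytes" % len(body))
    # A truncated JSON document would also fail here with json.JSONDecodeError.
    return json.loads(body.decode("utf-8"))
```

If you use requests instead, note that response.content is already decompressed, so its length may legitimately differ from Content-Length when the server uses gzip. In that case, parsing the JSON and catching the resulting ValueError is a useful second check.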