CPython: why does a 3-line script take far more than 3 cycles in the interpreter?

Time: 2018-05-28 16:46:52

Tags: python cpython python-internals

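I just watched this YouTube lecture by Philip Guo on CPython internals, and one thing puzzles me.

At 25:55, he modifies CPython's C source code by inserting printf("hello\n") at the start of the infinite loop that executes every bytecode instruction. You can do the same: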

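  • Download the Python 2.7 C source code
  • Open the file Python/ceval.c
  • Find the start of the endless evaluation loop, for (;;) {
  • Add the line printf("hello\n"); as the first line inside that loop.
  • Run configure and make to build the Python binary.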

He wrote a 3-line test.py:

X = 1
Y = 2
print X + Y

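The puzzle is: when he runs test.py with the modified interpreter, why are so many "hello"s printed before we see "3"?

Those 3 lines should compile to only a few bytecode instructions (load the value 1, load the value 2, and an instruction that calls print), so I would expect to see only a few "hello"s while the bytecode compiled from test.py executes.

So does the compiler actually generate a lot of internal bytecode instructions before it compiles the external Python script?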

1 answer:

Answer 0: (score: 2)

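There are two reasons you see so many hello lines printed:

  • Python does not have a dedicated bytecode for every possible Python statement. Instead, a statement is executed as a combination of bytecodes.
  • The Python interpreter imports a series of Python modules just to start up. You can run a regular Python interpreter with the -v switch to see what gets imported each time. Each of those modules contains multiple statements, so a fair amount of bytecode has to be executed before your small script is even reached.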
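The second point can be checked without rebuilding the interpreter. The following is a minimal sketch, assuming Python 3.7 or later (not the Python 2.7 build discussed here): it starts a fresh interpreter and reports how many entries sys.modules already holds before any user code has imported anything. The helper name startup_module_count is made up for illustration, and the count includes built-in and frozen modules as well as modules loaded from .pyc files, so it will differ between versions and platforms.

```python
import subprocess
import sys


def startup_module_count():
    """Return how many modules a fresh interpreter has loaded at startup.

    Hypothetical helper: it runs a child interpreter whose only job is to
    print len(sys.modules), so everything counted was loaded during
    interpreter startup (plus the `import sys` lookup itself, which only
    fetches an already-loaded built-in module).
    """
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(len(sys.modules))"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("modules loaded at startup:", startup_module_count())
```

Every module in that count that is written in Python had its top-level statements executed by the same evaluation loop, which is where the extra loop iterations come from.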

If I put those 3 lines into test.py and run it with my unmodified Python 2.7 binary, using the -v switch, I see:

$ python2.7 -v test.py
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
# /..../lib/python2.7/site.pyc matches /..../lib/python2.7/site.py
import site # precompiled from /..../lib/python2.7/site.pyc
# /..../lib/python2.7/os.pyc matches /..../lib/python2.7/os.py
import os # precompiled from /..../lib/python2.7/os.pyc
import errno # builtin
import posix # builtin
# /..../lib/python2.7/posixpath.pyc matches /..../lib/python2.7/posixpath.py
import posixpath # precompiled from /..../lib/python2.7/posixpath.pyc
# /..../lib/python2.7/stat.pyc matches /..../lib/python2.7/stat.py
import stat # precompiled from /..../lib/python2.7/stat.pyc
# /..../lib/python2.7/genericpath.pyc matches /..../lib/python2.7/genericpath.py
import genericpath # precompiled from /..../lib/python2.7/genericpath.pyc
# /..../lib/python2.7/warnings.pyc matches /..../lib/python2.7/warnings.py
import warnings # precompiled from /..../lib/python2.7/warnings.pyc
# /..../lib/python2.7/linecache.pyc matches /..../lib/python2.7/linecache.py
import linecache # precompiled from /..../lib/python2.7/linecache.pyc
# /..../lib/python2.7/types.pyc matches /..../lib/python2.7/types.py
import types # precompiled from /..../lib/python2.7/types.pyc
# /..../lib/python2.7/UserDict.pyc matches /..../lib/python2.7/UserDict.py
import UserDict # precompiled from /..../lib/python2.7/UserDict.pyc
# /..../lib/python2.7/_abcoll.pyc matches /..../lib/python2.7/_abcoll.py
import _abcoll # precompiled from /..../lib/python2.7/_abcoll.pyc
# /..../lib/python2.7/abc.pyc matches /..../lib/python2.7/abc.py
import abc # precompiled from /..../lib/python2.7/abc.pyc
# /..../lib/python2.7/_weakrefset.pyc matches /..../lib/python2.7/_weakrefset.py
import _weakrefset # precompiled from /..../lib/python2.7/_weakrefset.pyc
import _weakref # builtin
# /..../lib/python2.7/copy_reg.pyc matches /..../lib/python2.7/copy_reg.py
import copy_reg # precompiled from /..../lib/python2.7/copy_reg.pyc
import encodings # directory /..../lib/python2.7/encodings
# /..../lib/python2.7/encodings/__init__.pyc matches /..../lib/python2.7/encodings/__init__.py
import encodings # precompiled from /..../lib/python2.7/encodings/__init__.pyc
# /..../lib/python2.7/codecs.pyc matches /..../lib/python2.7/codecs.py
import codecs # precompiled from /..../lib/python2.7/codecs.pyc
import _codecs # builtin
# /..../lib/python2.7/encodings/aliases.pyc matches /..../lib/python2.7/encodings/aliases.py
import encodings.aliases # precompiled from /..../lib/python2.7/encodings/aliases.pyc
# /..../lib/python2.7/encodings/utf_8.pyc matches /..../lib/python2.7/encodings/utf_8.py
import encodings.utf_8 # precompiled from /..../lib/python2.7/encodings/utf_8.pyc
Python 2.7.15 (default, May  7 2018, 17:08:03)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
3
# -- clean-up output omitted --

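Each of those import ... lines refers to either a built-in module (part of the Python binary, implemented in C) or a .pyc bytecode cache file. 17 such .pyc files are imported before the script code runs.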
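To tally those lines rather than count them by eye, a small parser is enough. This is a rough sketch, assuming the Python 2.7 -v line format shown in the listing ("import NAME # builtin" and "import NAME # precompiled from PATH"); the function name classify_imports and the SAMPLE excerpt are illustrative only, and the excerpt is a shortened copy of the listing with the elided paths left as they are.

```python
def classify_imports(verbose_output):
    """Count built-in and precompiled (.pyc) imports in `python -v` output.

    Returns a (builtin, precompiled) tuple. Lines that are not import
    lines, and import lines of other kinds (such as package directories),
    are ignored.
    """
    builtin = 0
    precompiled = 0
    for line in verbose_output.splitlines():
        if not line.startswith("import "):
            continue  # comment lines, the banner, and script output
        if line.endswith("# builtin"):
            builtin += 1
        elif "# precompiled from" in line:
            precompiled += 1
    return builtin, precompiled


# Shortened excerpt of the listing, used only to exercise the function.
SAMPLE = """\
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
# /..../lib/python2.7/site.pyc matches /..../lib/python2.7/site.py
import site # precompiled from /..../lib/python2.7/site.pyc
import errno # builtin
import encodings # directory /..../lib/python2.7/encodings
"""

if __name__ == "__main__":
    print(classify_imports(SAMPLE))  # (2, 1) for this excerpt
```

Feeding it the captured output of python2.7 -v test.py (which -v writes to stderr) would give the totals for a full run.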

The 3 lines in the main script translate into another 9 bytecode instructions:

>>> import dis
>>> dis.dis(compile(r'''\
... X = 1
... Y = 2
... print X + Y
... ''', '', 'exec'))
  2           0 LOAD_CONST               0 (1)
              3 STORE_NAME               0 (X)

  3           6 LOAD_CONST               1 (2)
              9 STORE_NAME               1 (Y)

  4          12 LOAD_NAME                0 (X)
             15 LOAD_NAME                1 (Y)
             18 BINARY_ADD
             19 PRINT_ITEM
             20 PRINT_NEWLINE
             21 LOAD_CONST               2 (None)
             24 RETURN_VALUE

(I ignored the last 2 bytecodes, which encode an extra return None that doesn't apply to modules.)
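The same count can be made programmatically. The sketch below assumes Python 3, so print is written as a function call and the opcode names and totals will not match the Python 2.7 listing above (for example, PRINT_ITEM and PRINT_NEWLINE no longer exist). The helper name instruction_names is made up for illustration; dis.get_instructions is the standard library call that yields one entry per instruction.

```python
import dis

# Python 3 spelling of the 3-line script from the question.
SOURCE = "X = 1\nY = 2\nprint(X + Y)\n"


def instruction_names(source):
    """Compile `source` as a module and return its opcode names in order."""
    code = compile(source, "<sketch>", "exec")
    return [instruction.opname for instruction in dis.get_instructions(code)]


if __name__ == "__main__":
    names = instruction_names(SOURCE)
    print(len(names), "instructions:")
    for name in names:
        print("   ", name)
```

Whatever the exact total on a given version, it stays well above 3, because each statement is made up of several load, store and call instructions.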