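I just watched Philip Guo's YouTube lecture on CPython internals, and one thing puzzles me.
At 25:55 he modifies CPython's C source by inserting printf("hello\n") at the top of the infinite loop that executes every bytecode instruction. You can do this too: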
Open Python/ceval.c and find the line
for (;;) {
then add
printf("hello\n");
as the first line of that infinite loop. Run configure and make to build the Python binary.
He then wrote a 3-line test.py:
X = 1
Y = 2
print X + Y
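The puzzle: when he runs test.py with the modified interpreter, why are there so many "hello" lines before the "3" appears?
Those 3 lines should compile to only a handful of bytecode instructions (load the value 1, load the value 2, and the instructions that call print), so I would expect to see just a few "hello" lines while the bytecode compiled from test.py executes.
So does the interpreter actually run a lot of internal bytecode instructions before it gets to compiling and running the external Python script?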
Answer 0 (score: 2)
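There are two reasons you see so many hello lines being printed:
1. Python imports a number of modules at startup, before it ever reaches your script. Run the regular Python interpreter with the -v switch to see what is imported each time.
2. Each module contains multiple statements, so a fair amount of bytecode has to execute before the small script you are running is reached.
If I put those 3 lines into test.py and run it with my unmodified Python 2.7 binary, using the -v switch, I see: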
$ python2.7 -v test.py
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
# /..../lib/python2.7/site.pyc matches /..../lib/python2.7/site.py
import site # precompiled from /..../lib/python2.7/site.pyc
# /..../lib/python2.7/os.pyc matches /..../lib/python2.7/os.py
import os # precompiled from /..../lib/python2.7/os.pyc
import errno # builtin
import posix # builtin
# /..../lib/python2.7/posixpath.pyc matches /..../lib/python2.7/posixpath.py
import posixpath # precompiled from /..../lib/python2.7/posixpath.pyc
# /..../lib/python2.7/stat.pyc matches /..../lib/python2.7/stat.py
import stat # precompiled from /..../lib/python2.7/stat.pyc
# /..../lib/python2.7/genericpath.pyc matches /..../lib/python2.7/genericpath.py
import genericpath # precompiled from /..../lib/python2.7/genericpath.pyc
# /..../lib/python2.7/warnings.pyc matches /..../lib/python2.7/warnings.py
import warnings # precompiled from /..../lib/python2.7/warnings.pyc
# /..../lib/python2.7/linecache.pyc matches /..../lib/python2.7/linecache.py
import linecache # precompiled from /..../lib/python2.7/linecache.pyc
# /..../lib/python2.7/types.pyc matches /..../lib/python2.7/types.py
import types # precompiled from /..../lib/python2.7/types.pyc
# /..../lib/python2.7/UserDict.pyc matches /..../lib/python2.7/UserDict.py
import UserDict # precompiled from /..../lib/python2.7/UserDict.pyc
# /..../lib/python2.7/_abcoll.pyc matches /..../lib/python2.7/_abcoll.py
import _abcoll # precompiled from /..../lib/python2.7/_abcoll.pyc
# /..../lib/python2.7/abc.pyc matches /..../lib/python2.7/abc.py
import abc # precompiled from /..../lib/python2.7/abc.pyc
# /..../lib/python2.7/_weakrefset.pyc matches /..../lib/python2.7/_weakrefset.py
import _weakrefset # precompiled from /..../lib/python2.7/_weakrefset.pyc
import _weakref # builtin
# /..../lib/python2.7/copy_reg.pyc matches /..../lib/python2.7/copy_reg.py
import copy_reg # precompiled from /..../lib/python2.7/copy_reg.pyc
import encodings # directory /..../lib/python2.7/encodings
# /..../lib/python2.7/encodings/__init__.pyc matches /..../lib/python2.7/encodings/__init__.py
import encodings # precompiled from /..../lib/python2.7/encodings/__init__.pyc
# /..../lib/python2.7/codecs.pyc matches /..../lib/python2.7/codecs.py
import codecs # precompiled from /..../lib/python2.7/codecs.pyc
import _codecs # builtin
# /..../lib/python2.7/encodings/aliases.pyc matches /..../lib/python2.7/encodings/aliases.py
import encodings.aliases # precompiled from /..../lib/python2.7/encodings/aliases.pyc
# /..../lib/python2.7/encodings/utf_8.pyc matches /..../lib/python2.7/encodings/utf_8.py
import encodings.utf_8 # precompiled from /..../lib/python2.7/encodings/utf_8.pyc
Python 2.7.15 (default, May 7 2018, 17:08:03)
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
3
# -- clean-up output omitted --
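Each of the import ... lines there refers either to a built-in module (part of the Python binary, implemented in C) or to a .pyc bytecode cache file. 17 such .pyc files are imported before your script's code runs.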
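If you'd rather not read through the -v output by hand, here is a minimal sketch that asks a fresh interpreter which modules are already loaded at startup. It assumes Python 3.7+ (not the 2.7 binary used above), so the exact list and count will differ from my output; the names CHILD and startup_modules are just for illustration.

```python
import subprocess
import sys

# Code run by a fresh child interpreter: print every module name already
# present in sys.modules, one per line. Nothing but sys is imported by us.
CHILD = "import sys; print('\\n'.join(sorted(sys.modules)))"

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True,  # requires Python 3.7+
    text=True,
    check=True,
)

# Module names contain no whitespace, so a plain split() is enough.
startup_modules = result.stdout.split()

print(len(startup_modules), "modules loaded before any user import")
```

Note that this counts built-in and frozen modules as well as ones loaded from .pyc files, so it is not directly comparable to the 17 files above.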
The 3 lines of code in the main script translate to another 9 bytecode instructions:
>>> import dis
>>> dis.dis(compile(r'''\
... X = 1
... Y = 2
... print X + Y
... ''', '', 'exec'))
  2           0 LOAD_CONST               0 (1)
              3 STORE_NAME               0 (X)

  3           6 LOAD_CONST               1 (2)
              9 STORE_NAME               1 (Y)

  4          12 LOAD_NAME                0 (X)
             15 LOAD_NAME                1 (Y)
             18 BINARY_ADD
             19 PRINT_ITEM
             20 PRINT_NEWLINE
             21 LOAD_CONST               2 (None)
             24 RETURN_VALUE
(I'm ignoring the last 2 bytecodes, which encode an extra return None that doesn't really apply to a module.)
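To put the two numbers side by side, here is a rough sketch that counts instructions with dis instead of reading the disassembly. It assumes Python 3 (so print is a function call and the counts won't match the 2.7 listing above), and it only counts statically: the module-level code object of one startup import, not the instructions actually executed. I picked linecache simply because it appears in the -v listing; count_instructions is a helper made up for this example.

```python
import dis
import importlib.util

# Python 3 spelling of the three-line test.py. The original uses the
# Python 2 print statement, which Python 3 refuses to compile.
SCRIPT = "X = 1\nY = 2\nprint(X + Y)\n"


def count_instructions(code):
    """Count the instructions in one code object (nested functions excluded)."""
    return sum(1 for _ in dis.get_instructions(code))


script_code = compile(SCRIPT, "test.py", "exec")
script_count = count_instructions(script_code)

# Fetch the module-level code object of one imported module via its loader.
# This is the code that runs when the module is first imported.
spec = importlib.util.find_spec("linecache")
module_code = spec.loader.get_code("linecache")
module_count = count_instructions(module_code)

print("test.py:", script_count, "instructions")
print("linecache (module level only):", module_count, "instructions")
```

Even this single module's top-level code is larger than the whole script, and importing it triggers further imports of its own, which is why the hello lines pile up long before the 3 is printed.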