Why is the ELF generated from Python so large compared with the original source code?

Time: 2018-12-27 14:06:04

Tags: python c

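I can't help wondering why the ELF produced from Python is so large compared with the original source code. Let's look at the simplest possible program, hello world.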

user@linux:~/Python$ cat hello.py    
print('Hello, World!')
user@linux:~/Python$ 

Converting it to an ELF with pyinstaller

user@linux:~/Python$ pyinstaller -F hello.py 
48 INFO: PyInstaller: 3.4
49 INFO: Python: 3.6.7
50 INFO: Platform: Linux-4.15.0-38-generic-x86_64-with-Ubuntu-18.04-bionic
50 INFO: wrote /home/user/Python/hello.spec
53 INFO: UPX is not available.
54 INFO: Extending PYTHONPATH with paths
['/home/user/Python', '/home/user/Python']
55 INFO: checking Analysis
60 INFO: Building because _python_version changed
60 INFO: Initializing module dependency graph...
62 INFO: Initializing module graph hooks...
64 INFO: Analyzing base_library.zip ...
3061 INFO: running Analysis Analysis-00.toc
3096 INFO: Caching module hooks...
3100 INFO: Analyzing /home/user/Python/hello.py
3103 INFO: Loading module hooks...
3104 INFO: Loading module hook "hook-encodings.py"...
3169 INFO: Loading module hook "hook-pydoc.py"...
3170 INFO: Loading module hook "hook-xml.py"...
3388 INFO: Looking for ctypes DLLs
3388 INFO: Analyzing run-time hooks ...
3394 INFO: Looking for dynamic libraries
3632 INFO: Looking for eggs
3633 INFO: Python library not in binary dependencies. Doing additional searching...
3684 INFO: Using Python library /usr/lib/x86_64-linux-gnu/libpython3.6m.so.1.0
3695 INFO: Warnings written to /home/user/Python/build/hello/warn-hello.txt
3717 INFO: Graph cross-reference written to /home/user/Python/build/hello/xref-hello.html
3722 INFO: checking PYZ
3725 INFO: Building because toc changed
3725 INFO: Building PYZ (ZlibArchive) /home/user/Python/build/hello/PYZ-00.pyz
4053 INFO: Building PYZ (ZlibArchive) /home/user/Python/build/hello/PYZ-00.pyz completed successfully.
4059 INFO: checking PKG
4064 INFO: Building because toc changed
4064 INFO: Building PKG (CArchive) PKG-00.pkg
6474 INFO: Building PKG (CArchive) PKG-00.pkg completed successfully.
6476 INFO: Bootloader /home/user/.local/lib/python3.6/site-packages/PyInstaller/bootloader/Linux-64bit/run
6477 INFO: checking EXE
6479 INFO: Rebuilding EXE-00.toc because hello missing
6480 INFO: Building EXE from EXE-00.toc
6481 INFO: Appending archive to ELF section in EXE /home/user/Python/dist/hello
6516 INFO: Building EXE from EXE-00.toc completed successfully.
user@linux:~/Python$ 

The new ELF file

user@linux:~/Python/dist$ ./hello 
Hello, World!
user@linux:~/Python/dist$ 

user@linux:~/Python$ ls -lh hello.py   
-rw-rw-r-- 1 user user 23 Dis  27 21:43 hello.py
user@linux:~/Python$ 

user@linux:~/Python/dist$ ls -lh hello 
-rwxr-xr-x 1 user user 5.3M Dis  27 21:48 hello
user@linux:~/Python/dist$ 

As you can see, the original code is only 23 bytes, while the ELF is far larger... 5.3M!!!

Let's look at another example, this time in C.

user@linux:~/C$ cat hello.c  
#include<stdio.h>

int main()
{
    printf("Hello C World\n");
}
user@linux:~/C$ 

user@linux:~/C$ gcc hello.c -o helloC
user@linux:~/C$ 

user@linux:~/C$ ls -l helloC
-rwxrwxr-x 1 user user 8304 Dis  27 21:53 helloC
user@linux:~/C$ 

user@linux:~/C$ ./helloC
Hello C World
user@linux:~/C$ 

user@linux:~/C$ ls -l hello.c
-rw-rw-r-- 1 user user 65 Dis  27 21:52 hello.c
user@linux:~/C$ 

user@linux:~/C$ ls -lh helloC
-rwxrwxr-x 1 user user 8.2K Dis  27 21:53 helloC
user@linux:~/C$ 

Comparison

Python code size = 23 bytes
Python ELF size = 5.3M

C code size = 65 bytes
C ELF size = 8.2K

Is it possible to reduce the size?

2 answers:

Answer 0: (score: 5)

Because Python is not compiled to machine code.

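The ELF created by PyInstaller is simply your code packed together with all the necessary Python runtime files. It is in no way comparable to a compiled C binary, which contains machine code and dynamically links against libraries (such as libc.so).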
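To make the "not compiled to machine code" point concrete, here is a minimal sketch using only the standard library. It compiles the hello world source to a CPython code object, lists the virtual-machine opcodes, and runs the serialized bytecode back through the interpreter. The variable names are illustrative, and this is not what PyInstaller does internally; it only shows that the compiled form of the script is tiny and useless without an interpreter to execute it.

```python
import contextlib
import dis
import io
import marshal

SOURCE = "print('Hello, World!')\n"

# compile() produces a code object holding CPython bytecode, not machine code.
code = compile(SOURCE, "hello.py", "exec")

# Instruction names are CPython virtual-machine opcodes; the exact set
# varies between Python versions.
opnames = [ins.opname for ins in dis.get_instructions(code)]

# marshal is the serialization format used inside .pyc files.
payload = marshal.dumps(code)

# The bytecode only does something when an interpreter executes it.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(marshal.loads(payload))
output = buffer.getvalue()

print(f"source: {len(SOURCE.encode())} bytes")
print(f"marshalled code object: {len(payload)} bytes")
print("opcodes:", ", ".join(opnames))
print("ran under the interpreter:", output.strip())
```

The marshalled size differs between Python versions, but it stays far below the size of the runtime that has to ship alongside it.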

Answer 1: (score: 2)

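PyInstaller, py2exe, and almost every other project that "converts" Python files into executables do not actually convert anything. They simply pack a complete Python interpreter (4.4 MB on my machine) together with your project and all the dependencies it needs (all compiled to bytecode, which the interpreter runs) into a self-extracting executable. So it is normal for the result to be at least as large as a (compressed) Python installation.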

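Apart from the Python interpreter itself and heavy native dependencies (e.g. numpy, scipy, PyQt), almost nothing else contributes noticeably to the final executable size. You could have a 10 KLOC Python project and, as long as it pulls in no additional external dependencies, you would find that the final executable size is not significantly affected.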
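As a rough illustration of how much runtime sits underneath even an empty script, the sketch below sums the on-disk size of the files backing the modules a bare interpreter has already imported. The helper name loaded_module_bytes is made up for this example, and this is not how PyInstaller decides what to bundle (it runs its own dependency analysis), so treat the numbers as an order-of-magnitude hint only.

```python
import os
import sys


def loaded_module_bytes():
    """Sum the on-disk size of files backing the currently imported modules.

    Built-in and frozen modules may have no usable __file__ and are skipped,
    so this undercounts what a real bundle needs.
    """
    seen = set()
    total = 0
    for module in list(sys.modules.values()):
        path = getattr(module, "__file__", None)
        if not isinstance(path, str) or path in seen:
            continue
        if not os.path.isfile(path):
            continue
        seen.add(path)
        total += os.path.getsize(path)
    return total, len(seen)


total_bytes, file_count = loaded_module_bytes()
script_bytes = 23  # size of hello.py from the question

print(f"{file_count} module files, {total_bytes} bytes, before any user code")
print(f"hello.py itself: {script_bytes} bytes")
```

The libpython shared library and the compressed standard library archive come on top of this, which is why adding a few thousand lines of your own pure-Python code barely moves the total.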

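gcc, by contrast, compiles the C file and creates an actual executable containing just the machine code needed to import and call printf. That is a 15-byte string literal and a handful of bytes to set up the stack frame and actually call printf; the rest is ELF headers, import tables, and assorted linker junk (even just running strip -s on it shaves off 2 KB).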
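The 15-byte figure can be checked with a quick count. This small sketch assumes the literal from hello.c in the question plus the terminating NUL byte that the C compiler appends, and compares it with the 8304-byte file size reported by ls -l.

```python
# The C string literal from hello.c, including the terminating NUL byte
# that the compiler appends to every string literal.
literal = b"Hello C World\n" + b"\x00"

elf_size = 8304  # bytes, from `ls -l helloC` in the question

print(len(literal))  # 15
print(f"{len(literal) / elf_size:.2%} of the executable")
```

So the payload the program actually exists to print is a fraction of a percent of the file; nearly everything else is format overhead.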