Speeding up the build process with distutils

Asked: 2012-06-13 11:25:22

Tags: c++ python distutils

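I am writing a C++ extension for Python and I am using distutils to compile the project. As the project grows, rebuilding it takes longer and longer. Is there a way to speed up the build process?

I read that parallel builds (as with make -j) are not possible with distutils. Are there any good alternatives to distutils that might be faster?

I also noticed that every time I call python setup.py build, it recompiles all object files, even when I have changed only one source file. Is this expected, or am I doing something wrong here?

In case it helps, here are some of the files I am trying to compile: https://gist.github.com/2923577

Thanks!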

4 answers:

Answer 0 (score: 32)

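  1. Try building with the environment variable CC="ccache gcc", which speeds up the build significantly when the source has not changed. (Strangely, distutils uses CC for C++ source files as well.) Install the ccache package first, of course.

  2. Since you have a single extension that is assembled from multiple compiled object files, you can monkey-patch distutils to compile those in parallel (they are independent). Put this into your setup.py (adjust N=2 as you wish):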

    # monkey-patch for parallel compilation
    def parallelCCompile(self, sources, output_dir=None, macros=None, include_dirs=None, debug=0, extra_preargs=None, extra_postargs=None, depends=None):
        # those lines are copied from distutils.ccompiler.CCompiler directly
        macros, objects, extra_postargs, pp_opts, build = self._setup_compile(output_dir, macros, include_dirs, sources, depends, extra_postargs)
        cc_args = self._get_cc_args(pp_opts, debug, extra_preargs)
        # parallel code
        N=2 # number of parallel compilations
        import multiprocessing.pool
        def _single_compile(obj):
            try: src, ext = build[obj]
            except KeyError: return
            self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
        # convert to list, imap is evaluated on-demand
        list(multiprocessing.pool.ThreadPool(N).imap(_single_compile,objects))
        return objects
    import distutils.ccompiler
    distutils.ccompiler.CCompiler.compile=parallelCCompile
    
  3. For the sake of completeness, if you have multiple extensions, you can use the following solution:

    import os
    import multiprocessing
    try:
        from concurrent.futures import ThreadPoolExecutor as Pool
    except ImportError:
        from multiprocessing.pool import ThreadPool as LegacyPool
    
        # To ensure the with statement works. Required for some older 2.7.x releases
        class Pool(LegacyPool):
            def __enter__(self):
                return self
    
            def __exit__(self, *args):
                self.close()
                self.join()
    
    def build_extensions(self):
        """Function to monkey-patch
        distutils.command.build_ext.build_ext.build_extensions
    
        """
        self.check_extensions_list(self.extensions)
    
        try:
            num_jobs = os.cpu_count()
        except AttributeError:
            num_jobs = multiprocessing.cpu_count()
    
        with Pool(num_jobs) as pool:
            pool.map(self.build_extension, self.extensions)
    
    def compile(
        self, sources, output_dir=None, macros=None, include_dirs=None,
        debug=0, extra_preargs=None, extra_postargs=None, depends=None,
    ):
        """Function to monkey-patch distutils.ccompiler.CCompiler"""
        macros, objects, extra_postargs, pp_opts, build = self._setup_compile(
            output_dir, macros, include_dirs, sources, depends, extra_postargs
        )
        cc_args = self._get_cc_args(pp_opts, debug, extra_preargs)
    
        for obj in objects:
            try:
                src, ext = build[obj]
            except KeyError:
                continue
            self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)
    
        # Return *all* object filenames, not just the ones we just built.
        return objects
    
    
    from distutils.ccompiler import CCompiler
    from distutils.command.build_ext import build_ext
    build_ext.build_extensions = build_extensions
    CCompiler.compile = compile
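
As a small illustration of point 1, the CC setting can also be applied from inside setup.py, and only when ccache is actually installed. This is a sketch rather than part of the original answer: the helper name cc_with_ccache is hypothetical, and shutil.which assumes Python 3.3 or newer.

```python
import os
import shutil


def cc_with_ccache(compiler="gcc", which=shutil.which):
    """Return 'ccache <compiler>' if ccache is on PATH, else just '<compiler>'.

    `which` is injectable so the lookup can be replaced in tests.
    """
    if which("ccache"):
        return "ccache " + compiler
    return compiler


# Run this before calling setup(). setdefault() keeps any CC value that
# the user has already exported in the shell.
os.environ.setdefault("CC", cc_with_ccache())
```

If ccache is missing, the build simply falls back to the plain compiler instead of failing.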
    

Answer 1 (score: 7)

I got this working on Windows with clcache, derived from eudoxos's answer:

# Python modules
import datetime
import distutils
import distutils.ccompiler
import distutils.sysconfig
import multiprocessing
import multiprocessing.pool
import os
import sys

from distutils.core import setup
from distutils.core import Extension
from distutils.errors import CompileError
from distutils.errors import DistutilsExecError

now = datetime.datetime.now

ON_LINUX = "linux" in sys.platform

N_JOBS = 4

#------------------------------------------------------------------------------
# Enable ccache to speed up builds

if ON_LINUX:
    os.environ['CC'] = 'ccache gcc'

# Windows
else:

    # Using clcache.exe, see: https://github.com/frerich/clcache

    # Insert path to clcache.exe into the path.

    prefix = os.path.dirname(os.path.abspath(__file__))
    path = os.path.join(prefix, "bin")

    print("Adding %s to the system path." % path)
    os.environ['PATH'] = '%s;%s' % (path, os.environ['PATH'])

    clcache_exe = os.path.join(path, "clcache.exe")

#------------------------------------------------------------------------------
# Parallel Compile
#
# Reference:
#
# http://stackoverflow.com/questions/11013851/speeding-up-build-process-with-distutils
#

def linux_parallel_cpp_compile(
        self,
        sources,
        output_dir=None,
        macros=None,
        include_dirs=None,
        debug=0,
        extra_preargs=None,
        extra_postargs=None,
        depends=None):

    # Copied from distutils.ccompiler.CCompiler

    macros, objects, extra_postargs, pp_opts, build = self._setup_compile(
        output_dir, macros, include_dirs, sources, depends, extra_postargs)

    cc_args = self._get_cc_args(pp_opts, debug, extra_preargs)

    def _single_compile(obj):

        try:
            src, ext = build[obj]
        except KeyError:
            return

        self._compile(obj, src, ext, cc_args, extra_postargs, pp_opts)

    # convert to list, imap is evaluated on-demand

    list(multiprocessing.pool.ThreadPool(N_JOBS).imap(
        _single_compile, objects))

    return objects


def windows_parallel_cpp_compile(
        self,
        sources,
        output_dir=None,
        macros=None,
        include_dirs=None,
        debug=0,
        extra_preargs=None,
        extra_postargs=None,
        depends=None):

    # Copied from distutils.msvc9compiler.MSVCCompiler

    if not self.initialized:
        self.initialize()

    macros, objects, extra_postargs, pp_opts, build = self._setup_compile(
        output_dir, macros, include_dirs, sources, depends, extra_postargs)

    compile_opts = extra_preargs or []
    compile_opts.append('/c')

    if debug:
        compile_opts.extend(self.compile_options_debug)
    else:
        compile_opts.extend(self.compile_options)

    def _single_compile(obj):

        try:
            src, ext = build[obj]
        except KeyError:
            return

        input_opt = "/Tp" + src
        output_opt = "/Fo" + obj
        try:
            self.spawn(
                [clcache_exe]
                + compile_opts
                + pp_opts
                + [input_opt, output_opt]
                + extra_postargs)

        except DistutilsExecError as msg:
            raise CompileError(msg)

    # convert to list, imap is evaluated on-demand

    list(multiprocessing.pool.ThreadPool(N_JOBS).imap(
        _single_compile, objects))

    return objects

#------------------------------------------------------------------------------
# Only enable parallel compile on Python 2.7

if sys.version_info[:2] == (2, 7):

    if ON_LINUX:
        distutils.ccompiler.CCompiler.compile = linux_parallel_cpp_compile

    else:
        import distutils.msvccompiler
        import distutils.msvc9compiler

        distutils.msvccompiler.MSVCCompiler.compile = windows_parallel_cpp_compile
        distutils.msvc9compiler.MSVCCompiler.compile = windows_parallel_cpp_compile

# ... call setup() as usual
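
The parallel step in both compile functions above can use threads rather than processes, because each job spends its time waiting on an external compiler process. Here is a minimal sketch of that pattern on its own. The helper run_commands_in_parallel is hypothetical, and the commands start a Python child process as a stand-in for a real compiler:

```python
import subprocess
import sys
from multiprocessing.pool import ThreadPool


def run_commands_in_parallel(commands, n_jobs=4):
    """Run each argv list in its own child process, at most n_jobs at a time.

    Returns the exit codes in the same order as `commands`.
    """
    pool = ThreadPool(n_jobs)
    try:
        # map() blocks until every command has finished and keeps input order.
        return pool.map(subprocess.call, commands)
    finally:
        pool.close()
        pool.join()


# Stand-ins for compiler invocations: each child exits with a known code.
commands = [[sys.executable, "-c", "import sys; sys.exit(%d)" % code]
            for code in (0, 1, 2)]
exit_codes = run_commands_in_parallel(commands, n_jobs=2)
```

A non-zero entry in exit_codes corresponds to a failed command, which a real build would turn into an error.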

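Answer 2 (score: 1)

From the limited examples you provided in the link, it seems that you have some misunderstanding of certain features of the language. For example, gsminterface.h has a lot of namespace-level statics, which is probably unintended. Every translation unit that includes that header compiles its own version of every symbol declared in it. The side effects are not only compile time but also code bloat (larger binaries) and link time, as the linker needs to process all those symbols.

There are still many questions affecting the build process that you have not answered, for example whether you clean every time before recompiling. If you do, you might want to consider ccache, a tool that caches the results of the build process, so that if you run make clean; make target only the preprocessor is run for any translation unit that has not changed. Note that as long as you keep most of the code in headers, this will not offer much advantage, since a change in a header modifies every translation unit that includes it. (I don't know your build system, so I cannot tell you whether python setup.py build cleans or not.)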
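
To make the point about headers concrete, here is a toy model of a compiler cache. It assumes, as a simplification and not as ccache's actual algorithm, that the cache key is a hash of the compiler flags plus the preprocessed text. The file names a.cpp, b.cpp and c.cpp and all file contents are made up; only the name gsminterface.h comes from the question.

```python
import hashlib


def preprocess(source, headers):
    """Toy preprocessor: replace each '#include "name"' line with the header text."""
    out = []
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith('#include "') and stripped.endswith('"'):
            out.append(headers[stripped[len('#include "'):-1]])
        else:
            out.append(line)
    return "\n".join(out)


def cache_key(source, headers, flags=()):
    """Hash the flags and the preprocessed text (simplified cache key)."""
    digest = hashlib.sha256()
    digest.update(" ".join(flags).encode("utf-8"))
    digest.update(b"\0")
    digest.update(preprocess(source, headers).encode("utf-8"))
    return digest.hexdigest()


def stale_units(sources, old_headers, new_headers):
    """Names of translation units whose cache key changes with the headers."""
    return sorted(
        name for name, text in sources.items()
        if cache_key(text, old_headers) != cache_key(text, new_headers)
    )


sources = {
    "a.cpp": '#include "gsminterface.h"\nint a() { return timeout; }',
    "b.cpp": '#include "gsminterface.h"\nint b() { return timeout + 1; }',
    "c.cpp": "int c() { return 3; }",
}
headers_v1 = {"gsminterface.h": "static int timeout = 10;"}
headers_v2 = {"gsminterface.h": "static int timeout = 20;"}

# Editing the header invalidates every unit that includes it, but not c.cpp.
changed = stale_units(sources, headers_v1, headers_v2)
```

In this model a one-line header edit forces a.cpp and b.cpp to be recompiled while c.cpp still hits the cache, so the more code lives in widely included headers, the less a cache helps.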

The project does not look that big, so I would be surprised if compilation took more than a few seconds.

Answer 3 (score: 1)

If you have Numpy 1.10 available, this can be done easily. Just add:

 try:
     from numpy.distutils.ccompiler import CCompiler_compile
     import distutils.ccompiler
     distutils.ccompiler.CCompiler.compile = CCompiler_compile
 except ImportError:
     print("Numpy not found, parallel compile not available")

Then use -j N or set NPY_NUM_BUILD_JOBS.
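
For the environment-variable route, the value can be given a default from within setup.py before setup() runs. This is a minimal sketch, assuming numpy.distutils is available and reads NPY_NUM_BUILD_JOBS as described above. The helper name ensure_build_jobs is hypothetical, and os.cpu_count assumes Python 3.4 or newer:

```python
import os


def ensure_build_jobs(environ=os.environ, default=None):
    """Set NPY_NUM_BUILD_JOBS unless it is already set; return the value in effect."""
    if default is None:
        # cpu_count() may return None, so fall back to a single job.
        default = os.cpu_count() or 1
    return environ.setdefault("NPY_NUM_BUILD_JOBS", str(default))
```

Calling ensure_build_jobs() near the top of setup.py keeps a value exported in the shell and otherwise uses one job per CPU. Note that numpy.distutils is deprecated in recent NumPy releases, so this applies to older setups like the one in the answer.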