Question

Suppose I have a module file like this:

# my_module.py
print("hello")

Then I have a simple script:

# my_script.py
import my_module

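This will print "hello".

Let's say I want to "override" the print() function so that it prints "world" instead. How could I do that programmatically (without manually modifying my_module.py)?

I thought that I need to somehow modify the source code of my_module before or while importing it. Obviously, I cannot do this after importing it, so solutions using unittest.mock are not possible.

I also thought I could read the file my_module.py, perform the modification, then load it. But this is ugly, as it will not work if the module is located somewhere else.

I think a good solution would be to use importlib.

I read the docs and found a very interesting method: get_source(fullname). I thought I could just override it: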

def get_source(fullname):
    source = super().get_source(fullname)
    source = source.replace("hello", "world")
    return source

Unfortunately, I am a bit lost with all these abstract classes and I do not know how to do this properly.

I tried this, in vain:

spec = importlib.util.find_spec("my_module")
spec.loader.get_source = mocked_get_source
module = importlib.util.module_from_spec(spec)

Any help is welcome.

Answer 1

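Here's a solution I hacked together based on the content of this great talk. It allows arbitrary modifications to the source before the specified module is imported. It should be reasonably correct as long as the slides did not omit anything important. This only works on Python 3.5+.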

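import importlib.util
import sys

def modify_and_import(module_name, package, modification_func):
    # Locate the module without executing it.
    spec = importlib.util.find_spec(module_name, package)
    # Read the original source text through the loader.
    source = spec.loader.get_source(module_name)
    new_source = modification_func(source)
    # Build an empty module object, then run the patched source inside it.
    module = importlib.util.module_from_spec(spec)
    codeobj = compile(new_source, module.__spec__.origin, 'exec')
    exec(codeobj, module.__dict__)
    # Register the result so later imports get the patched module.
    sys.modules[module_name] = module
    return module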

So, using this you can do

my_module = modify_and_import("my_module", None, lambda src: src.replace("hello", "world"))
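One possible refinement, sketched here as an assumption rather than as part of the original answer: register the module in sys.modules before executing the patched code, and remove it again if execution fails. This mirrors the order the import system itself uses and may help when the patched module takes part in a circular import. The names modify_and_import_early and demo_patched_module are hypothetical.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def modify_and_import_early(module_name, modification_func):
    """Hypothetical variant: register the module before running its patched source."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        raise ImportError("No module named %r" % module_name)
    source = spec.loader.get_source(spec.name)
    if source is None:
        raise ImportError("No source available for %r" % module_name)
    code = compile(modification_func(source), spec.origin, "exec")
    module = importlib.util.module_from_spec(spec)
    # Register first, so imports triggered during execution can find the module.
    sys.modules[spec.name] = module
    try:
        exec(code, module.__dict__)
    except BaseException:
        # Do not leave a half-initialised module behind.
        sys.modules.pop(spec.name, None)
        raise
    return module


# Throwaway module written to a temporary directory, for demonstration only.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "demo_patched_module.py"), "w") as f:
        f.write('GREETING = "hello"\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        patched = modify_and_import_early(
            "demo_patched_module", lambda src: src.replace("hello", "world")
        )
    finally:
        sys.path.remove(tmp)

print(patched.GREETING)
```

The package argument is dropped in this sketch for brevity, so it only handles absolute module names.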

Answer 2

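This does not answer the general question of dynamically modifying the source code of an imported module, but "overriding" or "monkey-patching" its use of the print() function can be done (since it is a built-in function in Python 3.x). Here is how: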

#!/usr/bin/env python3
# my_script.py

import builtins

_print = builtins.print

def my_print(*args, **kwargs):
    _print('In my_print: ', end='')
    return _print(*args, **kwargs)

builtins.print = my_print

import my_module  # -> In my_print: hello
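The patch above stays in place for the rest of the program. If that is not wanted, one option is to wrap it in a context manager that restores the original print() afterwards. This is only a sketch: patched_print, recording_print and demo_print_module are hypothetical names introduced for the example.

```python
import builtins
import contextlib
import importlib
import os
import sys
import tempfile


@contextlib.contextmanager
def patched_print(replacement):
    """Temporarily replace builtins.print and restore it on exit."""
    original = builtins.print
    builtins.print = replacement
    try:
        yield original
    finally:
        builtins.print = original


captured = []


def recording_print(*args, **kwargs):
    # Record what would have been printed instead of writing to stdout.
    captured.append(" ".join(str(arg) for arg in args))


# Throwaway module written to a temporary directory, for demonstration only.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "demo_print_module.py"), "w") as f:
        f.write('print("hello")\n')
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        with patched_print(recording_print):
            importlib.import_module("demo_print_module")
    finally:
        sys.path.remove(tmp)

print(captured)
```

Only calls made while the with block is active are affected. Functions in the module that call print() later will use whatever builtins.print is at that time.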

Answer 3

If importing the module before patching it is OK, then a possible solution would be

import inspect

import my_module

source = inspect.getsource(my_module)
new_source = source.replace('"hello"', '"world"')
exec(new_source, my_module.__dict__)

If you are after a more general solution, you can also take a look at the approach I used in another answer a while ago.

Answer 4

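I first needed to better understand the import operation. Fortunately, this is well explained in the importlib documentation, and reading through the source code helped too.

The import process is actually split into two parts. First, a finder is in charge of parsing the module name (including dot-separated packages) and instantiating an appropriate loader. Indeed, built-in modules, for example, are not imported like local modules. Then the loader is called based on what the finder returned. This loader gets the source from a file or from a cache, and executes the code if the module was not previously loaded.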
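The finder/loader split can be observed without importing anything new, by asking the import system for a module spec. A small sketch, where sys and json are picked only as examples of a built-in module and a source-file module:

```python
import importlib.util

# A module compiled into the interpreter itself.
builtin_spec = importlib.util.find_spec("sys")
# A module shipped as a .py file in the standard library.
source_spec = importlib.util.find_spec("json")

# Each spec records which loader the finder selected and where the module lives.
print(builtin_spec.loader, builtin_spec.origin)
print(source_spec.loader, source_spec.origin)

# A source-file loader can return the module's text; the built-in one cannot.
print(builtin_spec.loader.get_source("sys") is None)
print(source_spec.loader.get_source("json")[:40])
```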

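This is very simple. It explains why I actually did not need to use the abstract classes from importlib.abc: I do not want to provide my own import process. Instead, I could create a subclass of one of the classes in importlib.machinery and override get_source() of SourceFileLoader, for example. However, this is not the way to go, because the loader is instantiated by the finder, so I have no say in which class is used. I cannot specify that my subclass should be used.

So the best solution would be to let the finder do its job, and then replace the get_source() method of whatever loader was instantiated.

Unfortunately, by looking at the source code I saw that the basic loaders do not use get_source() (which is only used by the inspect module). So my whole idea could not work.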
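For completeness, here is a sketch of how a loader subclass could still be slotted in. It is not what this answer ends up doing, and the names PatchingLoader, PatchingFinder and demo_hook_module are hypothetical. The assumption is that overriding get_code(), the method exec_module() calls on source loaders, is enough to change what gets executed, and that a small entry in sys.meta_path can delegate the search to the standard PathFinder and then swap the loader on the returned spec.

```python
import importlib
import importlib.abc
import importlib.machinery
import os
import sys
import tempfile


class PatchingLoader(importlib.machinery.SourceFileLoader):
    """Hypothetical loader that rewrites the source text before compiling it."""

    def __init__(self, fullname, path, patch):
        super().__init__(fullname, path)
        self._patch = patch

    def get_code(self, fullname):
        # Compiling here skips the .pyc cache, so patched bytecode is never
        # read from or written to disk.
        source = self.get_source(fullname)
        return compile(self._patch(source), self.get_filename(fullname), "exec")


class PatchingFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder: delegates the search, then swaps in PatchingLoader."""

    def __init__(self, patches):
        self._patches = patches  # maps module name -> function(str) -> str

    def find_spec(self, fullname, path, target=None):
        patch = self._patches.get(fullname)
        if patch is None:
            return None  # not ours: let the remaining finders handle it
        spec = importlib.machinery.PathFinder.find_spec(fullname, path, target)
        if spec is None or not isinstance(
            spec.loader, importlib.machinery.SourceFileLoader
        ):
            return None
        spec.loader = PatchingLoader(fullname, spec.origin, patch)
        return spec


# Throwaway module written to a temporary directory, for demonstration only.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "demo_hook_module.py"), "w") as f:
        f.write('GREETING = "hello"\n')
    finder = PatchingFinder(
        {"demo_hook_module": lambda src: src.replace("hello", "world")}
    )
    sys.meta_path.insert(0, finder)
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_hook_module  # an ordinary import statement triggers the patch
    finally:
        sys.path.remove(tmp)
        sys.meta_path.remove(finder)

print(demo_hook_module.GREETING)
```

The trade-off is that the hook affects every import of the listed names while it is installed, so it is removed again as soon as the import is done.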

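In the end, I guess get_source() should be called manually, then the returned source modified, and finally the code executed. This is what Martin Valgur detailed in his answer.

If compatibility with Python 2 is needed, I see no other way than reading the source file: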

import imp
import sys
import types

module_name = "my_module"

file, pathname, description = imp.find_module(module_name)

with open(pathname) as f:
    source = f.read()

source = source.replace('hello', 'world')

module = types.ModuleType(module_name)
exec(source, module.__dict__)

sys.modules[module_name] = module

Answer 5

Not elegant, but it works for me (you may have to add a path):

with open('my_module.py') as aFile:
    exec(aFile.read().replace(<something>, <something else>))

How can I modify the source code of an imported module on the fly?
