Question

我有一个包含子包的包，其中只有一个我需要在运行时导入 - 但我需要测试它们是否有效。这是我的文件夹结构：

game/
 __init__.py
 game1/
   __init__.py
   constants.py
   ...
 game2/
   __init__.py
   constants.py
   ...

现在，在启动时运行的代码执行：

import pkgutil
import game as _game
# Detect the known games
for importer,modname,ispkg in pkgutil.iter_modules(_game.__path__):
    if not ispkg: continue # game support modules are packages
    # Equivalent of "from game import <modname>"
    try:
        module = __import__('game',globals(),locals(),[modname],-1)
    except ImportError:
        deprint(u'Error in game support module:', modname, traceback=True)
        continue
    submod = getattr(module,modname)
    if not hasattr(submod,'fsName') or not hasattr(submod,'exe'): continue
    _allGames[submod.fsName.lower()] = submod

但是这样做的缺点是导入了所有子包，导入子包中的其他模块（例如constants.py等），这些模块相当于几兆字节的垃圾。所以我想用一个子模块有效的测试替换这个代码（他们将导入正常）。我想我应该以某种方式使用eval - 但是怎么样？或者我该怎么办？

编辑： tldr;

我正在寻找与上述循环核心相当的东西：

    try:
        probaly_eval(game, modname) # fails iff `from game import modname` fails
        # but does _not_ import the module
    except: # I'd rather have a more specific error here but methinks not possible
        deprint(u'Error in game support module:', modname, traceback=True)
        continue

所以我想要一个明确的答案，如果存在与导入语句的完全等效的与错误检查 - 没有导入模块。这是我的问题，很多回答者和评论者回答了不同的问题。

Answer 1

如果您想在不导入文件的情况下编译文件（ in current interpreter ），您可以使用py_compile.compile作为：

>>> import py_compile

# valid python file
>>> py_compile.compile('/path/to/valid/python/file.py')

# invalid python file
>>> py_compile.compile('/path/to/in-valid/python/file.txt')
Sorry: TypeError: compile() expected string without null bytes

上面的代码将错误写入std.error。如果您要引发异常，则必须将doraise设置为True（默认False）。因此，您的代码将是：

from py_compile import compile, PyCompileError

try:
    compile('/path/to/valid/python/file.py', doraise=True)
    valid_file = True
except PyCompileError:
    valid_file = False

根据py_compile.compile's documents：

将源文件编译为字节码并写出字节码缓存文件。源代码从名为file的文件加载。字节码写入cfile，默认为文件+ 'c'（如果在当前解释器中启用了优化，则为“o”）。如果指定了dfile，则将其用作错误消息中的源文件的名称而不是文件。如果doraise为真，则在编译文件时遇到错误时会引发PyCompileError。如果doraise为false（默认值），则会将错误字符串写入sys.stderr，但不会引发异常。

检查以确保未导入已编译的模块 （在当前解释器中）：

>>> import py_compile, sys
>>> py_compile.compile('/path/to/main.py')

>>> print [key for key in locals().keys() if isinstance(locals()[key], type(sys)) and not key.startswith('__')]
['py_compile', 'sys']  # main not present

Answer 2

也许您正在寻找agrep 'pattern1;pattern2' *.*或py_compile模块这里的文件：
https://docs.python.org/2/library/py_compile.html
https://docs.python.org/2/library/compileall.html#module-compileall

您可以加载您想要的模块，并在程序中调用它例如：

compileall

Answer 3

你无法有效地做你想做的事。为了查看包是否有效＆＃34;，您需要运行它 - 而不仅仅是检查它是否存在 - 因为它可能有错误或未满足的依赖关系。

使用pycompile和compileall只会测试是否可以编译python 文件，而不是导入模块。两者之间存在很大差异。

这种方法意味着您知道模块的实际文件结构 - import foo可以代表/foo.py或/foo/__init__.py。
这种方法并不能保证模块位于解释器的python路径中，或者是解释器加载的模块。如果您在/site-packages/中有多个版本，或者python正在查找模块的许多可能位置之一，那么事情会变得棘手。
仅仅因为您的文件＆＃34;编译＆＃34;并不意味着它会＆＃34;运行＆＃34;。作为一个软件包，它可能会出现未满足的依赖，甚至会引发错误。

想象一下这是你的python文件：

 from makebelieve import nothing
 raise ValueError("ABORT")

上面将编译，但如果你导入它们......如果你没有安装makebelieve它会引发一个ImportError，如果你这样做会引发一个ValueError。

我的建议是：

导入包然后卸载模块。要卸载它们，只需迭代sys.modules.keys()中的内容。如果您担心加载的外部模块，可以覆盖import以记录您的软件包加载的内容。这方面的一个例子是在我写的可怕的分析包中：https://github.com/jvanasco/import_logger [我忘记了我想要覆盖导入的想法。也许celery？]正如一些人所指出的，卸载模块完全依赖于解释器 - 但几乎每个选项都有许多缺点。
使用子进程通过popen启动新的解释器。即popen('python', '-m', 'module_name')。如果你对每个需要的模块执行此操作（每个解释器和导入的开销），这会产生很多开销，但是你可以写一个＆＃34; .py＆＃34;导入所需内容的文件，然后尝试运行它。在任何一种情况下，您都必须分析输出 - 导入＆＃34;有效＆＃34;包可能导致执行期间可接受的错误。我不记得子进程是否继承了你的环境变量，但我相信它确实如此。子进程是一个全新的操作系统进程/解释器，因此模块将被加载到那些短暂的进程中。 memory.clarified答案。

Answer 4

我相信imp.find_module至少满足您的一些要求：https://docs.python.org/2/library/imp.html#imp.find_module

快速测试显示它不会触发导入：

>>> import imp
>>> import sys
>>> len(sys.modules)
47
>>> imp.find_module('email')
(None, 'C:\\Python27\\lib\\email', ('', '', 5))
>>> len(sys.modules)
47
>>> import email
>>> len(sys.modules)
70

以下是我的部分代码（尝试对模块进行分类）的示例用法：https://github.com/asottile/aspy.refactor_imports/blob/2b9bf8bd2cf22ef114bcc2eb3e157b99825204e0/aspy/refactor_imports/classify.py#L38-L44

Answer 5

我们已经有了custom importer（免责声明：我没有写那个代码我只是当前的维护者）load_module：

def load_module(self,fullname):
    if fullname in sys.modules:
        return sys.modules[fullname]
    else: # set to avoid reimporting recursively
        sys.modules[fullname] = imp.new_module(fullname)
    if isinstance(fullname,unicode):
        filename = fullname.replace(u'.',u'\\')
        ext = u'.py'
        initfile = u'__init__'
    else:
        filename = fullname.replace('.','\\')
        ext = '.py'
        initfile = '__init__'
    try:
        if os.path.exists(filename+ext):
            with open(filename+ext,'U') as fp:
                mod = imp.load_source(fullname,filename+ext,fp)
                sys.modules[fullname] = mod
                mod.__loader__ = self
        else:
            mod = sys.modules[fullname]
            mod.__loader__ = self
            mod.__file__ = os.path.join(os.getcwd(),filename)
            mod.__path__ = [filename]
            #init file
            initfile = os.path.join(filename,initfile+ext)
            if os.path.exists(initfile):
                with open(initfile,'U') as fp:
                    code = fp.read()
                exec compile(code, initfile, 'exec') in mod.__dict__
        return mod
    except Exception as e: # wrap in ImportError a la python2 - will keep
        # the original traceback even if import errors nest
        print 'fail', filename+ext
        raise ImportError, u'caused by ' + repr(e), sys.exc_info()[2]

所以我认为我可以使用可覆盖的方法替换访问sys.modules缓存的部分，这些方法会覆盖该缓存：

所以：

@@ -48,2 +55,2 @@ class UnicodeImporter(object):
-        if fullname in sys.modules:
-            return sys.modules[fullname]
+        if self._check_imported(fullname):
+            return self._get_imported(fullname)
@@ -51 +58 @@ class UnicodeImporter(object):
-            sys.modules[fullname] = imp.new_module(fullname)
+            self._add_to_imported(fullname, imp.new_module(fullname))
@@ -64 +71 @@ class UnicodeImporter(object):
-                    sys.modules[fullname] = mod
+                    self._add_to_imported(fullname, mod)
@@ -67 +74 @@ class UnicodeImporter(object):
-                mod = sys.modules[fullname]
+                mod = self._get_imported(fullname)

并定义：

class FakeUnicodeImporter(UnicodeImporter):

    _modules_to_discard = {}

    def _check_imported(self, fullname):
        return fullname in sys.modules or fullname in self._modules_to_discard

    def _get_imported(self, fullname):
        try:
            return sys.modules[fullname]
        except KeyError:
            return self._modules_to_discard[fullname]

    def _add_to_imported(self, fullname, mod):
        self._modules_to_discard[fullname] = mod

    @classmethod
    def cleanup(cls):
        cls._modules_to_discard.clear()

然后我在sys.meta_path中添加了导入器，并且很高兴：

importer = sys.meta_path[0]
try:
    if not hasattr(sys,'frozen'):
        sys.meta_path = [fake_importer()]
    perform_the_imports() # see question
finally:
    fake_importer.cleanup()
    sys.meta_path = [importer]

对吗？错！

Traceback (most recent call last):
  File "bash\bush.py", line 74, in __supportedGames
    module = __import__('game',globals(),locals(),[modname],-1)
  File "Wrye Bash Launcher.pyw", line 83, in load_module
    exec compile(code, initfile, 'exec') in mod.__dict__
  File "bash\game\game1\__init__.py", line 29, in <module>
    from .constants import *
ImportError: caused by SystemError("Parent module 'bash.game.game1' not loaded, cannot perform relative import",)

嗯？我目前正在导入同一个模块。答案可能在import's docs

如果在缓存中找不到模块，则搜索sys.meta_path（可在PEP 302中找到sys.meta_path的规范）。

这不完全是关键，但我猜测是语句from .constants import * 查找sys.modules 以检查是否父模块在那里，我认为没有办法绕过它（注意我们的自定义加载器正在使用模块的内置导入机制，mod.__loader__ = self在事实之后设置。）

所以我更新了我的FakeImporter以使用sys.modules缓存然后清理它。

class FakeUnicodeImporter(UnicodeImporter):

    _modules_to_discard = set()

    def _check_imported(self, fullname):
        return fullname in sys.modules or fullname in self._modules_to_discard

    def _add_to_imported(self, fullname, mod):
        super(FakeUnicodeImporter, self)._add_to_imported(fullname, mod)
        self._modules_to_discard.add(fullname)

    @classmethod
    def cleanup(cls):
        for m in cls._modules_to_discard: del sys.modules[m]

然而，这会以一种新的方式 - 或者说两种方式 - 吹嘘：

对游戏/包的引用保存在sys.modules中的bash顶级包实例中：
```
bash\
  __init__.py
  the_code_in_question_is_here.py
  game\
    ...
```
因为game导入为bash.game。该引用引用了所有game1, game2,...子包，因此这些子包从未被垃圾回收
对另一个模块（brec）的引用由bash.brec模块实例保存为bash。此引用在game \ game1 中导入为from .. import brec，而不会触发导入，以更新SomeClass。但是，在另一个模块中，导入from ...brec import SomeClass 表单会触发导入，并且另一个实例的brec模块已结束在sys.modules中。该实例有一个未更新的SomeClass并且引发了一个AttributeError。

通过手动删除这些引用来解决这两个问题 - 所以gc收集了所有模块（75个ram中的5 mbytes），而from .. import brec确实触发了导入（from ... import foo vs {{1}保证一个问题）。

故事的寓意是有可能但是：

包和子包应该只相互引用
应从顶级包属性中删除对外部模块/包的所有引用
应从顶级包属性

如果这听起来很复杂并且容易出错 - 至少现在我对相互依赖性及其危险性有了更清晰的认识 - 是解决问题的时候了。

这篇文章是由Pydev的调试器赞助的 - 我发现from ...foo import bar模块非常有用，可以解决发生的事情 - 来自here的提示。当然，有很多变量是调试器和复杂的东西

如何在不导入的情况下检查运行时python模块是否有效？

5 个答案: