Is it possible to access inner functions and classes via code objects?

Asked: 2015-09-16 14:34:23

Tags: python python-3.x nested bytecode code-inspection

Say there's a function func

def func():
    class a:
        def method(self):
            return 'method'
    def a(): return 'function'
    lambda x: 'lambda'

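that I need to inspect.

As part of the inspection I'd like to "retrieve" the source code or the objects of all nested classes and functions (if any). I do realize that they don't exist yet, and that there is no direct/clean way of accessing them without running func or defining them outside of (before) func. Unfortunately, the most I can do is import a module containing func to obtain the func function object.

I discovered that functions have a __code__ attribute containing the code object, which in turn has a co_consts attribute, so I wrote this: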

In [11]: [x for x in func.__code__.co_consts if iscode(x) and x.co_name == 'a']
Out[11]: 
[<code object a at 0x7fe246aa9810, file "<ipython-input-6-31c52097eb5f>", line 2>,
 <code object a at 0x7fe246aa9030, file "<ipython-input-6-31c52097eb5f>", line 4>]

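These code objects look awfully similar, and I don't think they contain the data needed to distinguish between the types of objects they represent (e.g. type and function).

Q1: Am I right?

Q2: Is there any way to access the classes/functions (ordinary functions and lambdas) defined within the function body?

2 answers:

Answer 0 (score: 7)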

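A1: Things that can help you are:

Constants of the code object

From the documentation:

  If a code object represents a function, the first item in co_consts is the documentation string of the function, or None if undefined.

Also, if a code object represents a class, the first item of co_consts is always the qualified name of that class. You can try to use this information.

The following solution will work correctly in most cases, but you have to skip the code objects Python creates for list/set/dict comprehensions and generator expressions: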

from inspect import iscode

for x in func.__code__.co_consts:
    if iscode(x):
        # Skip <setcomp>, <dictcomp>, <listcomp> or <genexp>
        if x.co_name.startswith('<') and x.co_name != '<lambda>':
            continue
        firstconst = x.co_consts[0]
        # Compute the qualified name for the current code object
        # Note that we don't know its "type" yet
        qualname = '{func_name}.<locals>.{code_name}'.format(
                        func_name=func.__name__, code_name=x.co_name)
        if firstconst is None or firstconst != qualname:
            print(x, 'represents a function {!r}'.format(x.co_name))
        else:
            print(x, 'represents a class {!r}'.format(x.co_name))

This prints:

<code object a at 0x7fd149d1a9c0, file "<ipython-input>", line 2> represents a class 'a'
<code object a at 0x7fd149d1ab70, file "<ipython-input>", line 5> represents a function 'a'
<code object <lambda> at 0x7fd149d1aae0, file "<ipython-input>", line 6> represents a function '<lambda>'

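Code flags

There is a way to get the needed information from co_flags. Quoting the documentation I linked above:

  The following flag bits are defined for co_flags: bit 0x04 is set if the function uses the *arguments syntax to accept an arbitrary number of positional arguments; bit 0x08 is set if the function uses the **keywords syntax to accept arbitrary keyword arguments; bit 0x20 is set if the function is a generator.

  Other bits in co_flags are reserved for internal use.

The flags are manipulated in compute_code_flags (Python/compile.c):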

static int
compute_code_flags(struct compiler *c)
{
    PySTEntryObject *ste = c->u->u_ste;
    ...
    if (ste->ste_type == FunctionBlock) {
        flags |= CO_NEWLOCALS | CO_OPTIMIZED;
        if (ste->ste_nested)
            flags |= CO_NESTED;
        if (ste->ste_generator)
            flags |= CO_GENERATOR;
        if (ste->ste_varargs)
            flags |= CO_VARARGS;
        if (ste->ste_varkeywords)
            flags |= CO_VARKEYWORDS;
    }

    /* (Only) inherit compilerflags in PyCF_MASK */
    flags |= (c->c_flags->cf_flags & PyCF_MASK);

    n = PyDict_Size(c->u->u_freevars);
    ...
    if (n == 0) {
        n = PyDict_Size(c->u->u_cellvars);
        ...
        if (n == 0) {
            flags |= CO_NOFREE;
        }
    }
    ...
}

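There are two code flags (CO_NEWLOCALS and CO_OPTIMIZED) that are not set for classes. You can use them to check the type (which doesn't mean you should: poorly documented implementation details may change in the future):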

from inspect import iscode

for x in func.__code__.co_consts:
    if iscode(x):
        # Skip <setcomp>, <dictcomp>, <listcomp> or <genexp>
        if x.co_name.startswith('<') and x.co_name != '<lambda>':
            continue
        flags = x.co_flags
        # CO_OPTIMIZED = 0x0001, CO_NEWLOCALS = 0x0002
        if flags & 0x0001 and flags & 0x0002:
            print(x, 'represents a function {!r}'.format(x.co_name))
        else:
            print(x, 'represents a class {!r}'.format(x.co_name))

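The output is exactly the same.

Bytecode of the outer function

It is also possible to get the object type by inspecting the bytecode of the outer function.

Search the bytecode instructions for blocks containing LOAD_BUILD_CLASS, which signals the creation of a class (LOAD_BUILD_CLASS: pushes builtins.__build_class__() onto the stack. It is later called by CALL_FUNCTION to construct a class).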
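
The same flag check can also be written with the named constants that the inspect module exposes instead of the raw bit values. The following is only a sketch: the helper classify_nested and the function sample (a stand-in for func from the question) are hypothetical names introduced for illustration, and the check relies on the same CPython implementation detail as the code above.

```python
from inspect import CO_NEWLOCALS, CO_OPTIMIZED, iscode

# Both bits are set for function bodies and neither for class bodies
# (a CPython implementation detail, as noted above).
FUNCTION_FLAGS = CO_OPTIMIZED | CO_NEWLOCALS


def classify_nested(outer):
    """Return (kind, name) pairs for the code objects nested directly in `outer`."""
    result = []
    for const in outer.__code__.co_consts:
        if not iscode(const):
            continue
        name = const.co_name
        # Skip <setcomp>, <dictcomp>, <listcomp> and <genexpr>
        if name.startswith('<') and name != '<lambda>':
            continue
        is_function = const.co_flags & FUNCTION_FLAGS == FUNCTION_FLAGS
        result.append(('function' if is_function else 'class', name))
    return result


# Hypothetical stand-in for `func` from the question
def sample():
    class a:
        def method(self):
            return 'method'
    def a(): return 'function'
    return lambda x: 'lambda'


for kind, name in classify_nested(sample):
    print(kind, name)
```

Returning pairs instead of printing makes the result easy to filter, for example to keep only the class bodies.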

from dis import Bytecode
from inspect import iscode
from itertools import groupby

def _group(i):
    if i.starts_line is not None: _group.starts = i
    return _group.starts

bytecode = Bytecode(func)

for _, iset in groupby(bytecode, _group):
    iset = list(iset)
    try:
        code = next(arg.argval for arg in iset if iscode(arg.argval))
        # Skip <setcomp>, <dictcomp>, <listcomp> or <genexp>
        if code.co_name.startswith('<') and code.co_name != '<lambda>':
            raise TypeError
    except (StopIteration, TypeError):
        continue
    else:
        if any(x.opname == 'LOAD_BUILD_CLASS' for x in iset):
            # LOAD_BUILD_CLASS in the group means a class is being built
            print(code, 'represents a class {!r}'.format(code.co_name))
        else:
            print(code, 'represents a function {!r}'.format(code.co_name))

The output is the same (again).

A2: Sure.

Source code

In order to get the source code for the code objects, you can use inspect.getsource or an equivalent:
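
A variant of the bytecode approach that does not group instructions by source line is sketched below. It assumes that the code object loaded right after a LOAD_BUILD_CLASS instruction is the class body, which matches how CPython emits class definitions but is not a documented guarantee. The names classify_by_bytecode and sample are hypothetical.

```python
from dis import get_instructions
from inspect import iscode


def classify_by_bytecode(outer):
    """Return (kind, name) pairs by scanning the bytecode of `outer`."""
    result = []
    building_class = False
    for instr in get_instructions(outer):
        if instr.opname == 'LOAD_BUILD_CLASS':
            # The next code object to be loaded is assumed to be a class body
            building_class = True
        elif iscode(instr.argval):
            kind = 'class' if building_class else 'function'
            building_class = False
            name = instr.argval.co_name
            # Skip <setcomp>, <dictcomp>, <listcomp> and <genexpr>
            if name.startswith('<') and name != '<lambda>':
                continue
            result.append((kind, name))
    return result


# Hypothetical stand-in for `func` from the question
def sample():
    class a:
        def method(self):
            return 'method'
    def a(): return 'function'
    return lambda x: 'lambda'


for kind, name in classify_by_bytecode(sample):
    print(kind, name)
```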

from inspect import iscode, ismethod, getsource
from textwrap import dedent


def nested_sources(ob):
    if ismethod(ob):
        ob = ob.__func__
    try:
        code = ob.__code__
    except AttributeError:
        raise TypeError('Can\'t inspect {!r}'.format(ob)) from None
    for c in code.co_consts:
        if not iscode(c):
            continue
        name = c.co_name
        # Skip <setcomp>, <dictcomp>, <listcomp> or <genexp>
        if not name.startswith('<') or name == '<lambda>':
            yield dedent(getsource(c))

For example, nested_sources(complex_func) (see below)

import abc


def complex_func():
    lambda x: 42

    def decorator(cls):
        return lambda: cls()

    @decorator
    class b():
        def method():
            pass

    class c(int, metaclass=abc.ABCMeta):
        def method():
            pass

    {x for x in ()}
    {x: x for x in ()}
    [x for x in ()]
    (x for x in ())

must yield the source code for the first lambda, decorator, b (including @decorator) and c:

In [41]: nested_sources(complex_func)
Out[41]: <generator object nested_sources at 0x7fd380781d58>

In [42]: for source in _:
   ....:     print(source, end='=' * 30 + '\n')
   ....:     
lambda x: 42
==============================
def decorator(cls):
    return lambda: cls()
==============================
@decorator
class b():
    def method():
        pass
==============================
class c(int, metaclass=abc.ABCMeta):
    def method():
        pass
==============================
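
Note that inspect.getsource reads the file named by the code object's co_filename, and raises OSError if the source cannot be retrieved. The sketch below shows the whole round trip for a module stored on disk. The module name demo_module, its contents and the helper load_from_file are hypothetical, chosen to mirror func from the question.

```python
import importlib.util
import os
import tempfile
from inspect import getsource, iscode
from textwrap import dedent

# Hypothetical module source, mirroring `func` from the question
MODULE_SOURCE = dedent('''\
    def func():
        class a:
            def method(self):
                return 'method'
        def a(): return 'function'
        return lambda x: 'lambda'
''')


def load_from_file(path, name='demo_module'):
    """Import the file at `path` as a module called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo_module.py')
    with open(path, 'w') as handle:
        handle.write(MODULE_SOURCE)
    module = load_from_file(path)
    # getsource() needs the file on disk, so call it before the
    # temporary directory is removed.
    sources = [dedent(getsource(const))
               for const in module.func.__code__.co_consts
               if iscode(const)]

for source in sources:
    print(source, end='=' * 30 + '\n')
```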

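Function and type objects

If you still need a function/class object, you can eval/exec the source code. In the examples below, sources holds the strings yielded by nested_sources.

Examples

  • for lambda functions: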

    In [39]: source = sources[0]
    
    In [40]: eval(source, func.__globals__)
    Out[40]: <function __main__.<lambda>>
    
  • for regular functions:

    In [21]: source, local = sources[1], {}
    
    In [22]: exec(source, func.__globals__, local)
    
    In [23]: local.popitem()[1]
    Out[23]: <function __main__.decorator>
    
  • for classes:

    In [24]: source, local = sources[3], {}
    
    In [25]: exec(source, func.__globals__, local)
    
    In [26]: local.popitem()[1] 
    Out[26]: __main__.c
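
As an alternative to going through the source text, objects can in simple cases be rebuilt directly from the nested code objects. This is a sketch under strong assumptions rather than a general solution: it skips code objects with free variables (closures), it loses default argument values, and it ignores base classes, metaclasses and decorators. The names rebuild_nested and sample are hypothetical.

```python
from inspect import CO_NEWLOCALS, CO_OPTIMIZED, iscode
from types import FunctionType

FUNCTION_FLAGS = CO_OPTIMIZED | CO_NEWLOCALS


def rebuild_nested(outer):
    """Yield (name, object) pairs rebuilt from code objects nested in `outer`."""
    for code in outer.__code__.co_consts:
        if not iscode(code):
            continue
        name = code.co_name
        # Skip <setcomp>, <dictcomp>, <listcomp> and <genexpr>
        if name.startswith('<') and name != '<lambda>':
            continue
        if code.co_freevars:
            continue  # closures would need cell objects; not handled here
        if code.co_flags & FUNCTION_FLAGS == FUNCTION_FLAGS:
            # Function body: wrap the code object in a new function
            yield name, FunctionType(code, outer.__globals__)
        else:
            # Class body: run it to fill a namespace, then build a plain class
            namespace = {}
            exec(code, outer.__globals__, namespace)
            yield name, type(name, (), namespace)


# Hypothetical stand-in for `func` from the question
def sample():
    class a:
        def method(self):
            return 'method'
    def a(): return 'function'
    return lambda x: 'lambda'


rebuilt = list(rebuild_nested(sample))
for name, obj in rebuilt:
    print(name, type(obj).__name__)
```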
    

Answer 1 (score: 0)

Disassemble the x object. x can denote either a module, a class, a method, a function, a generator, an asynchronous generator, a coroutine, a code object, a string of source code or a byte sequence of raw bytecode. For a module, it disassembles all functions. For a class, it disassembles all methods (including class and static methods). For a code object or sequence of raw bytecode, it prints one line per bytecode instruction. It also recursively disassembles nested code objects (the code of comprehensions, generator expressions and nested functions, and the code used for building nested classes). Strings are first compiled to code objects with the compile() built-in function before being disassembled. If no object is provided, this function disassembles the last traceback.

The disassembly is written as text to the supplied file argument if provided and to sys.stdout otherwise.

The maximal depth of recursion is limited by depth unless it is None. depth=0 means no recursion.

Changed in version 3.4: Added file parameter.

Changed in version 3.7: Implemented recursive disassembling and added depth parameter.

Changed in version 3.7: This can now handle coroutine and asynchronous generator objects.

https://docs.python.org/3/library/dis.html#dis.dis
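
The behaviour described above can be tried with a short sketch that captures the disassembly as text. It assumes Python 3.7 or later (for the depth parameter). The helper disassemble_to_text and the function sample are hypothetical names.

```python
import io
from dis import dis


def disassemble_to_text(obj, depth=None):
    """Return the disassembly of `obj` as a string."""
    buffer = io.StringIO()
    # file= redirects the output; depth=None recurses into every
    # nested code object, depth=0 disables the recursion.
    dis(obj, file=buffer, depth=depth)
    return buffer.getvalue()


# Hypothetical stand-in for `func` from the question
def sample():
    class a:
        def method(self):
            return 'method'
    def a(): return 'function'
    return lambda x: 'lambda'


print(disassemble_to_text(sample))
```

With the default depth, the output also contains the disassembly of the nested class body, function and lambda, each introduced by a "Disassembly of <code object ...>" line.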