Replacing a macro-style class method with a decorator?

Time: 2012-07-25 23:07:35

Tags: python decorator with-statement contextmanager

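Despite having read many articles on the subject (including [this][1] very popular one), I'm having a great deal of trouble getting a good grasp on decorators. I suspect I must be stupid, but with all the stubbornness that comes with being stupid, I've decided to try to figure this out.

That, and I suspect I have a good use case...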

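Below is some code from one of my projects that extracts text from PDF files. Processing involves three steps:

  1. Set up the PDFMiner objects needed to process the PDF file (boilerplate initialization).
  2. Apply a processing function to the PDF file.
  3. Close the file, no matter what happens.

I recently learned about context managers and the with statement, and this seemed like a good use case for them. As such, I started by defining the PDFMinerWrapper class: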

    class PDFMinerWrapper(object):
        '''
        Usage:
        with PDFMinerWrapper('/path/to/file.pdf') as doc:
            doc.dosomething()
        '''
        def __init__(self, pdf_doc, pdf_pwd=''):
            self.pdf_doc = pdf_doc
            self.pdf_pwd = pdf_pwd
    
        def __enter__(self):
            self.pdf = open(self.pdf_doc, 'rb')
            parser = PDFParser(self.pdf)  # create a parser object associated with the file object
            doc = PDFDocument()  # create a PDFDocument object that stores the document structure
            parser.set_document(doc)  # connect the parser and document objects
            doc.set_parser(parser)
            doc.initialize(self.pdf_pwd)  # pass '' if no password required
            return doc
    
        def __exit__(self, type, value, traceback):
            self.pdf.close()
            # if we have an error, catch it, log it, and return the info
            if isinstance(value, Exception):
                self.logError()
                print traceback
                return value
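
For comparison, the same open/use/close shape can also be written with contextlib.contextmanager from the standard library. This is only a sketch under assumptions: FakeDoc and pdf_document are hypothetical stand-ins rather than PDFMiner APIs, and the real parser setup is left out.

```python
from contextlib import contextmanager
import io


class FakeDoc(object):
    """Hypothetical stand-in for a parsed PDF document (not a PDFMiner class)."""
    def __init__(self, stream):
        self.stream = stream
        self.is_extractable = True


@contextmanager
def pdf_document(stream):
    """Yield a document built on `stream` and always close the stream afterwards."""
    doc = FakeDoc(stream)      # the real code would build PDFParser/PDFDocument here
    try:
        yield doc              # the body of the `with` block runs at this point
    finally:
        stream.close()         # runs whether or not the body raised


stream = io.BytesIO(b"%PDF-1.4 dummy bytes")
with pdf_document(stream) as doc:
    extractable = doc.is_extractable
```

In this form an exception raised inside the with block still propagates to the caller after the stream has been closed.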
    

Now I can easily work with a PDF file and be sure that errors are handled gracefully. In theory, all I need to do is something like this:

    with PDFMinerWrapper('/path/to/pdf') as doc:
        foo(doc)
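
One detail worth checking in the __exit__ method of the wrapper: Python treats a true return value from __exit__ as a request to suppress the exception, so returning the exception instance (which is truthy) appears to swallow the error after logging it. A minimal sketch of that behaviour, using a hypothetical Resource class rather than the real wrapper:

```python
class Resource(object):
    """Hypothetical context manager used only to illustrate __exit__ semantics."""
    def __init__(self, suppress):
        self.suppress = suppress
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        self.closed = True          # cleanup always runs
        return self.suppress        # True swallows the exception, False re-raises it


def run(suppress):
    """Report whether the exception escaped the with block, and whether cleanup ran."""
    res = Resource(suppress)
    try:
        with res:
            raise ValueError("bad PDF")
    except ValueError:
        return "propagated", res.closed
    return "suppressed", res.closed
```

If callers need to see failures, returning None or False from __exit__ keeps the cleanup while letting the exception through.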
    

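This works well, except that I need to check that the PDF document is extractable *before* applying a function to the object returned by PDFMinerWrapper. My current solution involves an intermediate step.

I'm working with a class I call Pamplemousse, which serves as an interface for working with the PDFs. It, in turn, uses PDFMinerWrapper each time an operation must be performed on the file to which the object is linked.

Here is some (abridged) code that demonstrates its use: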

    class Pamplemousse(object):
        def __init__(self, inputfile, passwd='', enc='utf-8'):
            self.pdf_doc = inputfile
            self.passwd = passwd
            self.enc = enc
    
        def with_pdf(self, fn, *args):
            result = None
            with PDFMinerWrapper(self.pdf_doc, self.passwd) as doc:
                if doc.is_extractable:  # This is the test I need to perform
                    # apply function and return result
                    result = fn(doc, *args)
    
            return result
    
        def _parse_toc(self, doc):
            toc = []
            try:
                toc = [(level, title) for level, title, dest, a, se in doc.get_outlines()]
            except PDFNoOutlines:
                pass
            return toc
    
        def get_toc(self):
            return self.with_pdf(self._parse_toc)
    

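Each time I want to perform an operation on the PDF file, I pass the relevant function and its arguments to the with_pdf method. The with_pdf method, in turn, uses the with statement to exploit the context manager of PDFMinerWrapper (thus ensuring graceful handling of exceptions) and performs the check before actually applying the function it has been passed.

My question is as follows:

I would like to simplify this code so that I do not have to call Pamplemousse.with_pdf explicitly. My understanding is that decorators could help here, so:

1. How would I implement a decorator whose job is to invoke the with statement and perform the extractability check?
2. Can a decorator be a class method, or must my decorator be a free-standing function or class?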

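4 Answers:

Answer 0: (score: 1)

The way I interpret your goal is that you want to define multiple methods on your Pamplemousse class without constantly having to wrap them in that call. Here is a really simplified version of what it might look like: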

def if_extractable(fn):
    # this expects to be wrapping a Pamplemousse object
    def wrapped(self, *args):
        print "wrapper(): Calling %s with" % fn, args
        result = None
        with PDFMinerWrapper(self.pdf_doc) as doc:
            if doc.is_extractable:
                result = fn(self, doc, *args)
        return result
    return wrapped


class Pamplemousse(object):

    def __init__(self, inputfile):
        self.pdf_doc = inputfile

    # get_toc will only get called if the wrapper check
    # passes the extractable test
    @if_extractable
    def get_toc(self, doc, *args):
        print "get_toc():", self, doc, args

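The decorator if_extractable is defined as just a function, but it expects to be used on instance methods of your class.

The decorated get_toc, which used to delegate to a private method, will simply expect to receive a doc object and the args if the check passes. Otherwise it is never called and the wrapper returns None.

With this, you can keep defining your operation functions so that they expect a doc.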
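
To try the idea without PDFMiner installed, here is a self-contained sketch. StubDoc and StubWrapper are hypothetical stand-ins for the real document and for PDFMinerWrapper, and the rule that a path starting with 'locked' is not extractable is invented purely for the demo.

```python
class StubDoc(object):
    """Hypothetical stand-in for the document PDFMinerWrapper would yield."""
    def __init__(self, is_extractable):
        self.is_extractable = is_extractable


class StubWrapper(object):
    """Hypothetical context manager standing in for PDFMinerWrapper."""
    def __init__(self, pdf_doc, pdf_pwd=''):
        # Invented rule for the demo: names starting with 'locked' fail the check
        self.doc = StubDoc(is_extractable=not pdf_doc.startswith('locked'))

    def __enter__(self):
        return self.doc

    def __exit__(self, exc_type, exc_value, tb):
        return False  # do not suppress exceptions


def if_extractable(fn):
    """Open the document, run the check, and call fn(self, doc, ...) only if it passes."""
    def wrapped(self, *args, **kwargs):
        with StubWrapper(self.pdf_doc, self.passwd) as doc:
            if doc.is_extractable:
                return fn(self, doc, *args, **kwargs)
        return None
    return wrapped


class Pamplemousse(object):
    def __init__(self, inputfile, passwd=''):
        self.pdf_doc = inputfile
        self.passwd = passwd

    @if_extractable
    def describe(self, doc, prefix):
        # `doc` is supplied by the decorator; callers pass only `prefix`
        return '%s: %s' % (prefix, self.pdf_doc)
```

Callers write Pamplemousse('open.pdf').describe('ok') and never mention doc; for a file that fails the check the same call returns None.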

You could even add some type checking to make sure it is wrapping the expected class:

def if_extractable(fn):
    def wrapped(self, *args):
        if not hasattr(self, 'pdf_doc'):
            raise TypeError('if_extractable() is wrapping '
                            'a non-Pamplemousse object')
        ...

Answer 1: (score: 0)

A decorator is just a function that takes a function and returns another function. You can do anything you like inside it:

def my_func():
    return 'banana'

def my_decorator(f): # see, it takes a function as an argument
    def wrapped():
        res = None
        # pdf_doc and passwd are assumed to be defined elsewhere
        with PDFMinerWrapper(pdf_doc, passwd) as doc:
            res = f()
        return res
    return wrapped # see, I return a function that also calls f

Now, if you apply the decorator:

@my_decorator
def my_func():
    return 'banana'

The wrapped function replaces my_func, so the extra code runs on every call.
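
The @ line is shorthand for reassigning the name, which the following sketch makes explicit. The names shout and greet are hypothetical; functools.wraps is an optional standard-library helper that copies the original function's name and docstring onto the wrapper.

```python
import functools


def shout(f):
    """Decorator: upper-case whatever f returns."""
    @functools.wraps(f)          # keep f's __name__ and __doc__ on the wrapper
    def wrapped(*args, **kwargs):
        return f(*args, **kwargs).upper()
    return wrapped


def greet(name):
    """Return a greeting."""
    return 'hello ' + name


# Equivalent to putting @shout above `def greet`
greet = shout(greet)

result = greet('banana')
```

Without functools.wraps the decorated function would report its name as wrapped, which can make tracebacks and logs harder to read.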

Answer 2: (score: 0)

You might want to try something along these lines:

def with_pdf(self, fn, *args):
    def wrappedfunc(*args):
        result = None
        with PDFMinerWrapper(self.pdf_doc, self.passwd) as doc:
            if doc.is_extractable:  # This is the test I need to perform
                # apply function and return result
                result = fn(doc, *args)
        return result
    return wrappedfunc

When you need to wrap a function, just do this:

@pamplemousseinstance.with_pdf
def foo(doc, *args):
    print 'I am doing stuff with', doc
    print 'I also got some good args. Take a look!', args
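
Because with_pdf is looked up on an instance here, the decorator is tied to that one object's file and password at decoration time. A self-contained sketch of the pattern follows; StubWrapper and Fruit are hypothetical stand-ins for PDFMinerWrapper and Pamplemousse, and the path is made up.

```python
class StubWrapper(object):
    """Hypothetical stand-in for PDFMinerWrapper; yields a dict instead of a PDF document."""
    def __init__(self, pdf_doc, passwd=''):
        self.doc = {'path': pdf_doc, 'is_extractable': True}

    def __enter__(self):
        return self.doc

    def __exit__(self, exc_type, exc_value, tb):
        return False


class Fruit(object):
    """Hypothetical stand-in for Pamplemousse."""
    def __init__(self, inputfile, passwd=''):
        self.pdf_doc = inputfile
        self.passwd = passwd

    def with_pdf(self, fn):
        """Used as @instance.with_pdf: returns fn wrapped for this instance's file."""
        def wrappedfunc(*args):
            with StubWrapper(self.pdf_doc, self.passwd) as doc:
                if doc['is_extractable']:
                    return fn(doc, *args)
            return None
        return wrappedfunc


report = Fruit('/tmp/report.pdf')   # hypothetical path; nothing is opened


@report.with_pdf
def page_label(doc, page):
    return '%s#%d' % (doc['path'], page)
```

The trade-off is that page_label is permanently bound to report; working with a second file needs a second decorated function.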

Answer 3: (score: 0)

Here is some demonstration code:

#! /usr/bin/python

class Doc(object):
    """Dummy PDFParser Object"""

    is_extractable = True
    text = ''

class PDFMinerWrapper(object):
    '''
    Usage:
    with PDFMinerWrapper('/path/to/file.pdf') as doc:
        doc.dosomething()
    '''
    def __init__(self, pdf_doc, pdf_pwd=''):
        self.pdf_doc = pdf_doc
        self.pdf_pwd = pdf_pwd

    def __enter__(self):
        return self.pdf_doc

    def __exit__(self, type, value, traceback):
        pass

def safe_with_pdf(fn):
    """
    This is the decorator; it gets passed the fn we want
    to decorate.

    Note that the decorator itself never sees the instance:
    it receives only the function. The instance arrives later
    as the first argument (self) of the wrapper below.
    """
    print "---- Decorator ----"
    print "safe_with_pdf: First arg (fn):", fn
    def wrapper(self, *args, **kargs):
        """
        This will get passed the functions arguments and kargs,
        which means that we can intercept them here.
        """
        print "--- We are now in the wrapper ---"
        print "wrapper: First arg (self):", self
        print "wrapper: Other args (*args):", args
        print "wrapper: Other kargs (**kargs):", kargs

        # fn is accessible here because wrapper is a closure,
        # so it still has access to the decorator's local
        # variables.
        print "wrapper: The function we run (fn):", fn

        # This wrapper is now pretending to be the original function

        # Perform all the checks and stuff
        with PDFMinerWrapper(self.pdf, self.passwd) as doc:
            if doc.is_extractable:
                # Now call the original function with its
                # argument and pass it the doc
                result = fn(doc, *args, **kargs)
            else:
                result = None
        print "--- End of the Wrapper ---"
        return result

    # Decorators are expected to return a function, this
    # function is then run instead of the decorated function.
    # So instead of returning the original function we return the
    # wrapper. The wrapper will be run with the original functions
    # argument.

    # Now by using closures we can still access the original
    # functions by looking up fn (the argument that was passed
    # to this function) inside of the wrapper.
    print "--- Decorator ---"
    return wrapper


class SomeKlass(object):

    @safe_with_pdf
    def pdf_thing(doc, some_argument):
        print ''
        print "-- The Function --"

        # This function is now passed the doc from the wrapper.

        print 'The contents of the pdf:', doc.text
        print 'some_argument', some_argument
        print "-- End of the Function --"
        print ''

doc = Doc()
doc.text = 'PDF contents'
klass = SomeKlass()  
klass.pdf = doc
klass.passwd = ''
klass.pdf_thing('arg')

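I suggest running that code to see how it works. A few interesting points to note:

First, you'll notice that we pass only a single argument to pdf_thing(), yet the method itself is declared with two parameters: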

@safe_with_pdf
def pdf_thing(doc, some_argument):
    print ''
    print "-- The Function --"

This is because of what happens in the wrapper, where the function is actually called:

with PDFMinerWrapper(self.pdf, self.passwd) as doc:
    if doc.is_extractable:
        # Now call the original function with its
        # argument and pass it the doc
        result = fn(doc, *args, **kargs)

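We generate the doc argument and pass it along with the original arguments (*args, **kargs). This means that every method or function wrapped with this decorator receives an extra doc argument, supplied by the wrapper rather than by the caller. That is why doc appears in the declaration (def pdf_thing(doc, some_argument):) but not in the call.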

Another thing to note is that the wrapper:

def wrapper(self, *args, **kargs):
    """
    This will get passed the functions arguments and kargs,
    which means that we can intercept them here.
    """

also captures the self argument but does not pass it on to the method being called. You can change this behaviour by changing:

        result = fn(doc, *args, **kargs)
    else:
        result = None

to:

        result = fn(self, doc, *args, **kargs)
    else:
        result = None

and then changing the method itself to:

def pdf_thing(self, doc, some_argument):

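Hope that helps; feel free to ask for more clarification.

Edit:

To answer the second part of your question.

Yes, it can be defined inside the class. Just place safe_with_pdf inside SomeKlass, above any use of it, e.g. as the first method in the class.

Here is also a simplified version of the code above, with the decorator inside the class.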

class SomeKlass(object):
    def safe_with_pdf(fn):
        """The decorator which will wrap the method"""
        def wrapper(self, *args, **kargs):
            """The wrapper, which calls the method only if the doc is extractable"""
            with PDFMinerWrapper(self.pdf, self.passwd) as doc:
                if doc.is_extractable:
                    result = fn(doc, *args, **kargs)
                else:
                    result = None
            return result
        return wrapper

    @safe_with_pdf
    def pdf_thing(doc, some_argument):
        """The method to decorate"""
        print 'The contents of the pdf:', doc.text
        print 'some_argument', some_argument
        return '%s - Result' % doc.text

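klass = SomeKlass()  # re-create the instance from the new class definition
klass.pdf = doc      # the dummy Doc instance from the demo above
klass.passwd = ''
print klass.pdf_thing('arg')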
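
A caveat with defining the decorator inside the class: while the class body is executing, the helper is still a plain function, which is why using it with @ works there. Afterwards it stays on the class as an attribute that makes little sense to call on an instance. One possible tidy-up, sketched here with hypothetical names (Opener, guarded), is to delete the helper at the end of the class body.

```python
class Opener(object):
    """Hypothetical class; `guarded` plays the role of safe_with_pdf."""

    def guarded(fn):
        # Plain function at class-body time: receives the method being decorated
        def wrapper(self, *args, **kwargs):
            if not self.extractable:
                return None
            return fn(self, *args, **kwargs)
        return wrapper

    def __init__(self, extractable):
        self.extractable = extractable

    @guarded
    def title(self, suffix):
        return 'title' + suffix

    del guarded   # optional: keep the helper from leaking out as a pseudo-method
```

The decorated method keeps working after the del because wrapper closes over fn, not over the name guarded. A module-level function, as in the earlier answers, avoids the question entirely.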