How can I compute a hash of a given source file and the source files it imports in Python?

Time: 2016-02-22 11:27:44

Tags: python pickle

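I have a source file, say a.py, which imports b.py and some built-in modules. b.py may in turn import c.py, d.py, and so on. In a.py there is a slow operation that generates an object. To speed things up, I use the pickle module to dump the generated object and load it back if it has already been generated.

However, if any of the source files a, b, c, or d is modified, the object should be generated again. To avoid deleting the pickle file by hand every time, I want to compute a hash of the sources and write it into the pickle file. Then I can check the stored hash and decide whether to regenerate the object.

How can I write a function that, given a.py, recursively finds b.py, c.py, and d.py and computes one hash over all of them?

Is there a better way to solve this problem?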

1 answer:

Answer 0: (score: 1)

You can do something like this:

import sys

# Copy the items first so that imports triggered during the loop
# cannot change the dict while it is being iterated
for name, module in list(sys.modules.items()):
    # Each module object represents one imported module
    try:
        # __file__ holds the path of the file the module was loaded from
        path = module.__file__
    except AttributeError:
        # Built-in modules and other special cases don't have a __file__
        # attribute, but you shouldn't care about them as their
        # behaviour won't change
        continue

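Now you can compute and compare your hashes. Honestly, though, I would say that comparing file modification times is sufficient and cheaper than computing hashes.

In my case, the output of print sys.modules is: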

{'copy_reg': <module 'copy_reg' from '/usr/lib64/python2.7/copy_reg.pyc'>, 'sre_compile': <module 'sre_compile' from '/usr/lib64/python2.7/sre_compile.pyc'>, '_sre': <module '_sre' (built-in)>, 'encodings': <module 'encodings' from '/usr/lib64/python2.7/encodings/__init__.pyc'>, 'site': <module 'site' from '/usr/lib64/python2.7/site.pyc'>, '__builtin__': <module '__builtin__' (built-in)>, 'sysconfig': <module 'sysconfig' from '/usr/lib64/python2.7/sysconfig.pyc'>, 'atexit': <module 'atexit' from '/usr/lib64/python2.7/atexit.pyc'>, '__main__': <module '__main__' (built-in)>, 'encodings.encodings': None, 'abc': <module 'abc' from '/usr/lib64/python2.7/abc.pyc'>, 'posixpath': <module 'posixpath' from '/usr/lib64/python2.7/posixpath.pyc'>, '_weakrefset': <module '_weakrefset' from '/usr/lib64/python2.7/_weakrefset.pyc'>, 'errno': <module 'errno' (built-in)>, 'encodings.codecs': None, 'sre_constants': <module 'sre_constants' from '/usr/lib64/python2.7/sre_constants.pyc'>, 're': <module 're' from '/usr/lib64/python2.7/re.pyc'>, '_abcoll': <module '_abcoll' from '/usr/lib64/python2.7/_abcoll.pyc'>, 'types': <module 'types' from '/usr/lib64/python2.7/types.pyc'>, '_codecs': <module '_codecs' (built-in)>, 'encodings.__builtin__': None, '_warnings': <module '_warnings' (built-in)>, 'genericpath': <module 'genericpath' from '/usr/lib64/python2.7/genericpath.pyc'>, 'stat': <module 'stat' from '/usr/lib64/python2.7/stat.pyc'>, 'zipimport': <module 'zipimport' (built-in)>, '_sysconfigdata': <module '_sysconfigdata' from '/usr/lib64/python2.7/_sysconfigdata.pyc'>, 'warnings': <module 'warnings' from '/usr/lib64/python2.7/warnings.pyc'>, 'UserDict': <module 'UserDict' from '/usr/lib64/python2.7/UserDict.pyc'>, 'encodings.utf_8': <module 'encodings.utf_8' from '/usr/lib64/python2.7/encodings/utf_8.pyc'>, 'sys': <module 'sys' (built-in)>, 'codecs': <module 'codecs' from '/usr/lib64/python2.7/codecs.pyc'>, 'readline': <module 'readline' from 
'/usr/lib64/python2.7/lib-dynload/readline.so'>, 'os.path': <module 'posixpath' from '/usr/lib64/python2.7/posixpath.pyc'>, '_locale': <module '_locale' from '/usr/lib64/python2.7/lib-dynload/_locale.so'>, 'rlcompleter': <module 'rlcompleter' from '/usr/lib64/python2.7/rlcompleter.pyc'>, 'signal': <module 'signal' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib64/python2.7/traceback.pyc'>, 'linecache': <module 'linecache' from '/usr/lib64/python2.7/linecache.pyc'>, 'posix': <module 'posix' (built-in)>, 'encodings.aliases': <module 'encodings.aliases' from '/usr/lib64/python2.7/encodings/aliases.pyc'>, 'exceptions': <module 'exceptions' (built-in)>, 'sre_parse': <module 'sre_parse' from '/usr/lib64/python2.7/sre_parse.pyc'>, 'os': <module 'os' from '/usr/lib64/python2.7/os.pyc'>, '_weakref': <module '_weakref' (built-in)>}
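The hashing and modification-time checks mentioned above could look roughly like the following. This is only a sketch written against Python 3; the helper names `module_source_paths`, `combined_hash`, and `newest_mtime` are made up for illustration, and the `.pyc` to `.py` mapping assumes the source file sits next to the compiled file, as in the Python 2.7 output shown here.

```python
import hashlib
import os
import sys


def module_source_paths(modules=None):
    """Return the sorted source paths of modules that expose a __file__."""
    if modules is None:
        modules = list(sys.modules.values())
    paths = set()
    for module in modules:
        # Built-ins, None placeholders and namespace packages have no usable path
        path = getattr(module, "__file__", None)
        if not path:
            continue
        # Assumption: a .pyc/.pyo file has its .py source in the same directory
        if path.endswith((".pyc", ".pyo")):
            path = path[:-1]
        if os.path.isfile(path):
            paths.add(os.path.abspath(path))
    return sorted(paths)


def combined_hash(paths):
    """Hash the names and contents of all files into one hex digest."""
    digest = hashlib.sha256()
    for path in paths:
        digest.update(path.encode("utf-8"))
        with open(path, "rb") as handle:
            digest.update(handle.read())
    return digest.hexdigest()


def newest_mtime(paths):
    """Cheaper alternative: the latest modification time among the files."""
    return max(os.path.getmtime(path) for path in paths)
```

Sorting the paths keeps the digest stable across runs, since the iteration order of `sys.modules` is not something to rely on. Called without arguments, `module_source_paths()` also picks up standard-library files, so you may want to pass in only the modules you care about.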

Edit

How about:

import sys

# Remember the names of the modules that are already imported
modules_before = set(sys.modules)

# Import b, which will import c and d and so on
import b

# Copy the items so the dict cannot change size during iteration
for name, module in list(sys.modules.items()):
    if name in modules_before:
        # Skip modules that were loaded before b was imported
        continue
    # Each module object represents one imported module
    try:
        # __file__ holds the path of the file the module was loaded from
        path = module.__file__
    except AttributeError:
        # Built-in modules and other special cases don't have a __file__
        # attribute, but you shouldn't care about them as their
        # behaviour won't change
        continue
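To tie this back to the pickle cache from the question, one possible arrangement is to store a fingerprint of the sources next to the object and rebuild whenever the fingerprint differs. The function below is a hypothetical sketch, not something from the original answer: `load_or_build`, `cache_path`, `fingerprint`, and `build` are illustrative names, and `fingerprint` can be any comparable value such as a hash string or a modification time.

```python
import pickle


def load_or_build(cache_path, fingerprint, build):
    """Return the cached object if its stored fingerprint matches, else rebuild.

    cache_path:  where the pickle file lives (hypothetical parameter)
    fingerprint: value describing the current state of the source files
    build:       zero-argument callable that performs the slow generation
    """
    try:
        with open(cache_path, "rb") as handle:
            stored_fingerprint, obj = pickle.load(handle)
        if stored_fingerprint == fingerprint:
            return obj
    except (OSError, EOFError, pickle.UnpicklingError, ValueError, TypeError):
        # Missing, truncated or unexpected cache file: fall through and rebuild
        pass
    obj = build()
    with open(cache_path, "wb") as handle:
        # Store the fingerprint together with the object in a single pickle
        pickle.dump((fingerprint, obj), handle, protocol=pickle.HIGHEST_PROTOCOL)
    return obj
```

A call might look like `obj = load_or_build("a.pickle", source_hash, generate_object)`, where `source_hash` and `generate_object` stand in for your own fingerprint and slow function. Only unpickle cache files you wrote yourself, since loading a pickle from an untrusted source can execute arbitrary code.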