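I have a source file, say a.py, which imports b.py and some built-in modules. b.py may in turn import c.py, d.py, and so on. In a.py there is a slow operation that generates an object. To speed things up, I use the pickle module to dump the generated object and load it back if it has already been generated.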
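However, if the source code of any of a, b, c, or d is modified, the object should be generated again. To avoid deleting the pickle file by hand every time, I want to compute a hash of the sources and write it into the pickle file, so that I can check the hash and decide whether to regenerate the object.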
How can I write a function that, given a.py, recursively finds b.py, c.py, and d.py and computes a single hash over all of them?
Is there a better way to solve this problem?
Answer 0: (score: 1)
You can do something like this:
import sys

# list() takes a snapshot of the dict, so this loop runs on Python 2 and 3
for name, module in list(sys.modules.items()):
    # module object is a representation of an imported module
    try:
        # Next line shows how to access the path of the Python module file
        module.__file__
    except AttributeError:
        # Built-in modules and other special cases don't have a __file__
        # attribute, but you shouldn't care about them as their behaviour
        # won't change
        pass
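Now you can compute and compare your hashes. Honestly, though, I would say that comparing file modification times is sufficient and cheaper than computing hashes.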
In my case, the output of print sys.modules is:
{'copy_reg': <module 'copy_reg' from '/usr/lib64/python2.7/copy_reg.pyc'>, 'sre_compile': <module 'sre_compile' from '/usr/lib64/python2.7/sre_compile.pyc'>, '_sre': <module '_sre' (built-in)>, 'encodings': <module 'encodings' from '/usr/lib64/python2.7/encodings/__init__.pyc'>, 'site': <module 'site' from '/usr/lib64/python2.7/site.pyc'>, '__builtin__': <module '__builtin__' (built-in)>, 'sysconfig': <module 'sysconfig' from '/usr/lib64/python2.7/sysconfig.pyc'>, 'atexit': <module 'atexit' from '/usr/lib64/python2.7/atexit.pyc'>, '__main__': <module '__main__' (built-in)>, 'encodings.encodings': None, 'abc': <module 'abc' from '/usr/lib64/python2.7/abc.pyc'>, 'posixpath': <module 'posixpath' from '/usr/lib64/python2.7/posixpath.pyc'>, '_weakrefset': <module '_weakrefset' from '/usr/lib64/python2.7/_weakrefset.pyc'>, 'errno': <module 'errno' (built-in)>, 'encodings.codecs': None, 'sre_constants': <module 'sre_constants' from '/usr/lib64/python2.7/sre_constants.pyc'>, 're': <module 're' from '/usr/lib64/python2.7/re.pyc'>, '_abcoll': <module '_abcoll' from '/usr/lib64/python2.7/_abcoll.pyc'>, 'types': <module 'types' from '/usr/lib64/python2.7/types.pyc'>, '_codecs': <module '_codecs' (built-in)>, 'encodings.__builtin__': None, '_warnings': <module '_warnings' (built-in)>, 'genericpath': <module 'genericpath' from '/usr/lib64/python2.7/genericpath.pyc'>, 'stat': <module 'stat' from '/usr/lib64/python2.7/stat.pyc'>, 'zipimport': <module 'zipimport' (built-in)>, '_sysconfigdata': <module '_sysconfigdata' from '/usr/lib64/python2.7/_sysconfigdata.pyc'>, 'warnings': <module 'warnings' from '/usr/lib64/python2.7/warnings.pyc'>, 'UserDict': <module 'UserDict' from '/usr/lib64/python2.7/UserDict.pyc'>, 'encodings.utf_8': <module 'encodings.utf_8' from '/usr/lib64/python2.7/encodings/utf_8.pyc'>, 'sys': <module 'sys' (built-in)>, 'codecs': <module 'codecs' from '/usr/lib64/python2.7/codecs.pyc'>, 'readline': <module 'readline' from '/usr/lib64/python2.7/lib-dynload/readline.so'>, 'os.path': <module 'posixpath' from '/usr/lib64/python2.7/posixpath.pyc'>, '_locale': <module '_locale' from '/usr/lib64/python2.7/lib-dynload/_locale.so'>, 'rlcompleter': <module 'rlcompleter' from '/usr/lib64/python2.7/rlcompleter.pyc'>, 'signal': <module 'signal' (built-in)>, 'traceback': <module 'traceback' from '/usr/lib64/python2.7/traceback.pyc'>, 'linecache': <module 'linecache' from '/usr/lib64/python2.7/linecache.pyc'>, 'posix': <module 'posix' (built-in)>, 'encodings.aliases': <module 'encodings.aliases' from '/usr/lib64/python2.7/encodings/aliases.pyc'>, 'exceptions': <module 'exceptions' (built-in)>, 'sre_parse': <module 'sre_parse' from '/usr/lib64/python2.7/sre_parse.pyc'>, 'os': <module 'os' from '/usr/lib64/python2.7/os.pyc'>, '_weakref': <module '_weakref' (built-in)>}
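To make the "compute and compare" step concrete, here is a rough sketch of how the module files found through `sys.modules` might be turned into a single fingerprint. It assumes Python 3; the helper names `source_path`, `hash_sources`, and `latest_mtime` are made up for illustration, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import os
import sys


def source_path(module):
    """Return the module's .py file path, or None if it has no source file."""
    path = getattr(module, "__file__", None)
    if not path:
        return None  # built-in module, namespace package, or a None entry
    # Python 2 reports the compiled .pyc/.pyo file; map it back to the source.
    if path.endswith((".pyc", ".pyo")):
        path = path[:-1]
    if path.endswith(".py") and os.path.isfile(path):
        return path
    return None  # e.g. a compiled extension (.so) or a missing source file


def hash_sources(modules):
    """Return one SHA-256 hex digest over the source files of the given modules."""
    paths = sorted({p for p in map(source_path, modules) if p})
    digest = hashlib.sha256()
    for path in paths:
        digest.update(path.encode("utf-8"))
        with open(path, "rb") as handle:
            digest.update(handle.read())
    return digest.hexdigest()


def latest_mtime(modules):
    """Cheaper alternative: the newest modification time among the source files."""
    paths = [p for p in map(source_path, modules) if p]
    return max(os.path.getmtime(p) for p in paths) if paths else 0.0
```

Calling `hash_sources(list(sys.modules.values()))` would fingerprint every imported module that has a source file, including the standard library. Restricting the list to your own modules keeps the result stable across Python upgrades.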
Edit
How about this:
import sys
import copy

# Save the dict of modules imported so far
modules_before = copy.copy(sys.modules)

# Import b, which will import c and d and so on
import b

# list() takes a snapshot of the dict, so this loop runs on Python 2 and 3
for name, module in list(sys.modules.items()):
    if name in modules_before:
        # Skip irrelevant modules
        continue
    # module object is a representation of an imported module
    try:
        # Next line shows how to access the path of the Python module file
        module.__file__
    except AttributeError:
        # Built-in modules and other special cases don't have a __file__
        # attribute, but you shouldn't care about them as their behaviour
        # won't change
        pass
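Once you have some fingerprint for those module files (a hash or a modification time), the pickle side of the question could be handled roughly as below. This is only a sketch assuming Python 3; `load_or_build`, `cache_path`, `fingerprint`, and `build` are hypothetical names, not part of any library.

```python
import pickle


def load_or_build(cache_path, fingerprint, build):
    """Return the cached object if its stored fingerprint matches, else rebuild it."""
    try:
        with open(cache_path, "rb") as handle:
            # The fingerprint is stored first, so a stale object is never unpickled.
            if pickle.load(handle) == fingerprint:
                return pickle.load(handle)
    except (OSError, EOFError, pickle.UnpicklingError):
        pass  # missing or unreadable cache file: fall through and rebuild

    obj = build()  # the slow operation
    with open(cache_path, "wb") as handle:
        pickle.dump(fingerprint, handle)
        pickle.dump(obj, handle)
    return obj
```

Writing the fingerprint as a separate pickle ahead of the object means a cache produced by older source code is discarded before anything tries to unpickle it, which matters when class definitions in b, c, or d have changed. Note that the loop above skips modules that were already imported, so the file for a.py itself would need to be added to the fingerprint separately.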