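I'm running Windows 10, Python 3.6, and the latest development version of PyInstaller. I'm trying to deploy code that uses the dedupe module, and I get a very generic error when I run a Blocker with Predicates: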
Traceback (most recent call last):
  File "deduping.py", line 227, in <module>
  File "deduping.py", line 223, in main
  File "deduping.py", line 182, in dedupe
  File "deduping.py", line 92, in create_distance_matrix_blocking_based
  File "lib\site-packages\dedupe\blocking.py", line 45, in __call__
  File "lib\site-packages\dedupe\predicates.py", line 300, in __call__
  File "lib\site-packages\dedupe\predicates.py", line 300, in <listcomp>
  File "lib\site-packages\dedupe\predicates.py", line 156, in __call__
  File "lib\site-packages\dedupe\tfidf.py", line 36, in search
  File "lib\site-packages\dedupe\canopy_index.py", line 61, in apply
AttributeError: 'IFBucket' object has no attribute 'byValue'
When I run the code in plain Python, it works. I don't know which module is failing to load. I suspect zope.index, because all the Predicate and Blocker classes load fine.
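One way to narrow down which module fails to load is to run a small import probe inside the frozen executable and compare its output with a plain Python run. The following is only a minimal sketch: `probe_imports` and `CANDIDATES` are hypothetical names, and the candidate list is a guess based on the traceback and the build warnings (in particular, `BTrees._IFBTree` is assumed to be the compiled module behind `IFBucket`).

```python
import importlib


def probe_imports(module_names):
    """Try to import each name and report where it came from, or why it failed."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # a frozen app may raise more than ImportError
            report[name] = "FAILED: {}: {}".format(type(exc).__name__, exc)
        else:
            # Built-in modules have no __file__, so fall back to a placeholder.
            report[name] = getattr(module, "__file__", None) or "(no __file__)"
    return report


# Assumption: these names are guesses taken from the traceback and warnings,
# not a list confirmed by dedupe or PyInstaller.
CANDIDATES = ["zope.index", "BTrees", "BTrees.IFBTree", "BTrees._IFBTree"]

if __name__ == "__main__":
    for name, result in probe_imports(CANDIDATES).items():
        print("{:20s} {}".format(name, result))
```

If a name imports in plain Python but reports FAILED in the frozen build, that module is a candidate for a hidden import.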
I added a hook for the dedupe module:
# hook-dedupe.py
from PyInstaller.utils.hooks import collect_all, collect_data_files

# Collect dedupe's data files, binaries and submodules in one call.
datas, binaries, hiddenimports = collect_all('dedupe')
and saw the following warnings:
42736 INFO: Loading module hook "hook-dedupe.py"...
43816 INFO: Determining a mapping of distributions to packages...
53426 WARNING: Unable to find package for requirement dedupe-variable-datetime from package dedupe.
53426 WARNING: Unable to find package for requirement categorical-distance from package dedupe.
53426 WARNING: Unable to find package for requirement fastcluster from package dedupe.
53426 WARNING: Unable to find package for requirement dedupe-hcluster from package dedupe.
53426 WARNING: Unable to find package for requirement zope.index from package dedupe.
53426 WARNING: Unable to find package for requirement Levenshtein-search from package dedupe.
53426 INFO: Packages required by dedupe:
['simplecosine', 'highered', 'numpy', 'affinegap', 'BTrees', 'simplejson', 'future', 'doublemetaphone', 'rlr', 'haversine']
I tried adding zope.index as a hidden import, but it didn't help. I'd appreciate some guidance, since the error isn't informative.