I cloned someone's code from GitHub and tried to run it, but it failed. The error points to this function:
# processing files in a directory
# return {b:'d:f d:f ...', ...}
def proc_dir(dwid_dir):
    bf = defaultdict(str)
    for fname in sorted(os.listdir(dwid_dir), key=lambda d:int(d.split('.')[0])):
        day_bf = bitermFreq(dwid_dir + fname)
        for b, f in day_bf.items():
            bf[b] += '%s:%d ' % (fname.split('.')[0], f)
    return bf
Specifically, this line:
for fname in sorted(os.listdir(dwid_dir), key=lambda d:int(d.split('.')[0])):
All file names in that directory have the format {int number}.txt, for example 0.txt.
However, it raises this error:
Traceback (most recent call last):
  File "bitermDayFreq.py", line 11, in proc_dir
    for fname in sorted(os.listdir(dwid_dir), key=lambda d:int(d.split('.')[0])):
  File "bitermDayFreq.py", line 11, in <lambda>
    for fname in sorted(os.listdir(dwid_dir), key=lambda d:int(d.split('.')[0])):
ValueError: invalid literal for int() with base 10: ''
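The author told me he can run this code successfully. I wonder whether this is related to an encoding problem, and how can I fix it? Thanks in advance.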
Answer 0 (score: 0)
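The directory contains a file whose name starts with a dot (.). For such a name, d.split('.') returns a list whose first item is an empty string, and int('') raises the ValueError: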
>>> '.hidden'.split('.')
['', 'hidden']
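To find out which file actually triggers the error, a small check along these lines may help. This is only a sketch: the helper name find_non_numeric_names is made up for illustration and is not part of the original repository.

```python
import os

def find_non_numeric_names(directory):
    """Return the names in `directory` whose part before the first dot
    cannot be parsed by int(). (Hypothetical helper, for diagnosis only.)"""
    offenders = []
    for name in os.listdir(directory):
        prefix = name.split('.')[0]  # '' for names such as '.hidden'
        try:
            int(prefix)
        except ValueError:
            offenders.append(name)
    return sorted(offenders)
```

Calling find_non_numeric_names(dwid_dir) should list every entry that the sort key in proc_dir cannot handle, hidden files included.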
To skip files with a leading dot (hidden files on UNIX), replace the following line:
for fname in sorted(os.listdir(dwid_dir), key=lambda d:int(d.split('.')[0])):
with:
for fname in sorted([fn for fn in os.listdir(dwid_dir) if not fn.startswith('.')],
                    key=lambda d:int(d.split('.')[0])):

For example:

>>> filenames = ['.hidden', '12.txt', '2.txt']
>>> sorted([fn for fn in filenames if not fn.startswith('.')],
...        key=lambda d:int(d.split('.')[0]))
['2.txt', '12.txt']
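If the directory might also contain other stray files that are not hidden (a README, for instance), a slightly stricter filter could keep only names whose part before the first dot is a decimal number. The following is a sketch under that assumption; numeric_sorted_files is a hypothetical name, not something from the original code.

```python
import os

def numeric_sorted_files(directory):
    """List file names such as '12.txt' in numeric order, ignoring
    anything whose part before the first dot is not a decimal number.
    (Hypothetical helper; the name is an assumption.)"""
    # str.isdecimal() is False for '' and for non-numeric prefixes,
    # so '.hidden' and 'README' are both filtered out here.
    names = [fn for fn in os.listdir(directory)
             if fn.split('.')[0].isdecimal()]
    return sorted(names, key=lambda fn: int(fn.split('.')[0]))
```

The loop in proc_dir could then iterate over numeric_sorted_files(dwid_dir) instead of the inline sorted(...) call. The trade-off is that unexpected files are skipped silently rather than reported.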