I need help understanding why this code does not work as expected.
My directory structure looks like this:
|- tryWalkDir.py
TryCPP/
TryCPP/tryHashMap/
TryCPP/tryHashMap/tryHashMap.cpp
TryCPP/tryHashMap/tryHashMap.o*
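The script, tryWalkDir.py, is meant to find all .cpp files. I don't know why TryCPP/tryHashMap/tryHashMap.cpp is collected twice, giving [TryCPP/tryHashMap/tryHashMap.cpp, TryCPP/tryHashMap/tryHashMap.cpp], as the output below shows: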
Enter into depth:0, folder:TryCPP
folder:TryCPP, cur:TryCPP, sub:['tryHashMap'], files:[], depth:0
recresively call - s:tryHashMap
Enter into depth:1, folder:TryCPP/tryHashMap
folder:TryCPP/tryHashMap, cur:TryCPP/tryHashMap, sub:[], files:['tryHashMap.cpp', 'tryHashMap.o'], depth:1
process tryHashMap.cpp
append tryHashMap.cpp
process tryHashMap.o
Exit on depth:1, folder:TryCPP/tryHashMap
folder:TryCPP, cur:TryCPP/tryHashMap, sub:[], files:['tryHashMap.cpp', 'tryHashMap.o'], depth:0
process tryHashMap.cpp
append tryHashMap.cpp
process tryHashMap.o
Exit on depth:0, folder:TryCPP
['TryCPP/tryHashMap/tryHashMap.cpp', 'TryCPP/tryHashMap/tryHashMap.cpp']
import os

class Cell(object):
    def __init__(self, fn, ext):
        self.fn = fn
        self.ext = ext
        self.fl = []  # list of all the collected files

    def collect_files(self, folder, depth=0):
        ''' collect all the files with the corresponding extension '''
        print 'Enter into depth:%d, folder:%s' % (depth,folder)
        # level one folder name should start with 'Try' or 'try'
        if depth == 1:
            filename = os.path.basename(folder)[:3]
            if filename in ['Try','try']:
                pass
            else:
                print 'L1 Dir - {0} must start with [Try,try], depth:{1}'.format(filename,depth)
                return
        for cur, sub, files in os.walk(folder):
            print 'folder:{}, cur:{}, sub:{}, files:{}, depth:{}'.format(folder,cur,sub,files,depth)
            #filter out all the files
            #[ self.fl.append(cur+'/'+f) for f in files if os.path.splitext(f)[1][1:] == self.ext ]
            for f in files:
                print 'process %s' % f
                if os.path.splitext(f)[1][1:] == self.ext:
                    print 'append %s' % f
                    self.fl.append(cur+'/'+f)
            #if sub:
            for s in sub:
                print 'recresively call - s:{}'.format(s)
                self.collect_files(cur+'/'+s,depth+1)
        print 'Exit on depth:%d, folder:%s' % (depth,folder)

    def start(self):
        self.collect_files(self.fn,0)
        #print self.fl

def main():
    cell = Cell('TryCPP','cpp')
    cell.start()
    print cell.fl

if __name__ == '__main__': main()
Answer 0 (score: 0)
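The error happens because you are calling os.walk several times without realizing it. os.walk already recurses into subdirectories for you. On top of that, you call self.collect_files(cur+'/'+s,depth+1) for every subdirectory of the current directory, and each of those calls starts another os.walk of its own. In effect, a file in a nested directory is appended once by the outer walk and again by every recursive call that reaches it, so the deeper a file sits, the more often it appears in the output list. In your tree, tryHashMap.cpp is one level down and shows up twice.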
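To make the double visit concrete, here is a minimal sketch that counts how often each directory is reached. The helper names walk_once and walk_with_extra_recursion are made up for this illustration and are not part of the question's script. The second helper only mimics the shape of the question's loop and leaves out its level-one name check.

```python
import os
import tempfile
from collections import Counter

def walk_once(top):
    """Count visits per directory for a single os.walk call (hypothetical helper)."""
    return Counter(os.path.relpath(cur, top) for cur, sub, files in os.walk(top))

def walk_with_extra_recursion(top, start=None, counts=None):
    """Same walk, plus a recursive call per subdirectory, as in the question's loop."""
    start = top if start is None else start
    counts = Counter() if counts is None else counts
    for cur, sub, files in os.walk(start):
        counts[os.path.relpath(cur, top)] += 1
        for s in sub:
            # This call starts a second walk over a directory that the
            # enclosing os.walk is going to visit anyway.
            walk_with_extra_recursion(top, os.path.join(cur, s), counts)
    return counts

# Throwaway tree shaped like the one in the question.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'TryCPP', 'tryHashMap'))
top = os.path.join(root, 'TryCPP')
print(dict(walk_once(top)))
print(dict(walk_with_extra_recursion(top)))
```

With this tree, the single walk reaches tryHashMap once, while the version with the extra recursion reaches it twice, which matches the duplicated entry in the question's output.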
To fix the code, simply remove this loop:
            for s in sub:
                print 'recresively call - s:{}'.format(s)
                self.collect_files(cur+'/'+s,depth+1)
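With that loop gone, the collection step could look roughly like the sketch below. It is written as a standalone function rather than a method of Cell, and it returns the list instead of printing, both of which are assumptions made to keep the example short. The path is still built with '/' exactly as in the question.

```python
import os

def collect_files(folder, ext):
    """Return the paths under folder whose extension equals ext (given without the dot)."""
    found = []
    # A single os.walk already visits folder and every directory below it.
    for cur, sub, files in os.walk(folder):
        for f in files:
            if os.path.splitext(f)[1][1:] == ext:
                # Same manual concatenation as the question's code.
                found.append(cur + '/' + f)
    return found
```

Calling collect_files('TryCPP', 'cpp') on the tree from the question should then list tryHashMap.cpp only once. Note that dropping the recursion also drops the depth argument, so the level-one name check from the question no longer runs in this version.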
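By the way, you should use os.path.join instead of concatenating slashes by hand throughout the code. For example, self.fl.append(cur+'/'+f) could read self.fl.append(os.path.join(cur, f)). This is the approach the os.walk documentation recommends:
    To get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name).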
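Putting os.path.join to use, the sketch below is one possible way to keep the question's rule that level-one folders must start with 'Try' or 'try' while still walking only once. It relies on the documented behaviour that, in a top-down walk, the list of subdirectory names can be edited in place to prune the descent. The prefixes parameter and the standalone function form are assumptions for illustration, not something from the original script.

```python
import os

def collect_files(folder, ext, prefixes=('Try', 'try')):
    """Collect files with extension ext, skipping level-one folders without an allowed prefix."""
    found = []
    for cur, sub, files in os.walk(folder):
        if cur == folder:
            # Only the first iteration lists the level-one folders; editing
            # sub in place stops os.walk from descending into the others.
            sub[:] = [s for s in sub if s.startswith(prefixes)]
        for f in files:
            if os.path.splitext(f)[1][1:] == ext:
                found.append(os.path.join(cur, f))
    return found
```

Files directly inside folder are still collected, as they were in the question's version, because the name check only ever applied to the folders one level down.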