I have a set of files saved on my laptop. The folder structure is as follows:
Part1 (folder)
    Part1 (subfolder)
        awards_1990 (subfolder)
            awards_1990_00 (subfolder)
                (files)
            awards_1990_01
                (files)
            ...
            ...
            ...
        awards_1991
            awards_1991_01
                (files)
            awards_1991_01
            awards_1991_01
            ...
            ...
            ...
        awards_1992
            ...
            ...
            ...
        awards_1993
            ...
            ...
            ...
        awards_1994
            ...
            ...
            ...
So I tried to use os.walk to extract a list of file paths. My code looks like this:
import os
matches=[]
for root, dirnames, dirname in os.walk('E:\\Grad\\LIS\\LIS590 Text mining\\Part1\\Part1'):
    for dirname in dirnames:
        for filename in dirname:
            if filename.endswith(('.txt','.html','.pdf')):
                matches.append(os.path.join(root,filename))
When I inspect matches, it is [].
I tried another piece of code:
import os
dirnames=os.listdir('E:\\Grad\\LIS\\LIS590 Text mining\\Part1\\Part1')
for filenames in dirnames:
    for filename in filenames:
        path=os.path.join(filename)
        print (os.path.abspath(path))
This one gave me the following result:
C:\Python32\a
C:\Python32\w
C:\Python32\a
C:\Python32\r
C:\Python32\d
C:\Python32\s
C:\Python32\_
C:\Python32\1
...
I have researched this error but don't know what to do.
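Answer 0 (score: 3)
The signature is endswith(suffix[, start[, end]]), so if you have multiple suffixes, you need to wrap them in parentheses to pass them as a single tuple: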
if filename.endswith(('.txt','.html','.pdf')):
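To make the point about the signature concrete, here is a minimal sketch. The file names are made up for illustration. It shows that a tuple of suffixes is treated as a set of alternatives, while separate string arguments are read as suffix, start, and end, and are rejected.

```python
# Hypothetical file names, used only for illustration.
names = ["report.txt", "index.html", "paper.pdf", "photo.jpg"]

# A tuple of suffixes: endswith returns True if any one of them matches.
suffixes = (".txt", ".html", ".pdf")
matched = [name for name in names if name.endswith(suffixes)]
print(matched)

# Separate string arguments are read as (suffix, start, end),
# so the extra strings are rejected as invalid slice indices.
try:
    "report.txt".endswith(".txt", ".html", ".pdf")
    separate_args_error = None
except TypeError as exc:
    separate_args_error = exc
print(type(separate_args_error).__name__)
```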
Answer 1 (score: 0)
for filename in dirname: iterates over the individual characters of the dirname string. Try:
#!/usr/bin/env python
import os
topdir = r'E:\Grad\LIS\LIS590 Text mining\Part1\Part1'
matches = []
for root, dirnames, filenames in os.walk(topdir):
    for filename in filenames:
        if filename.endswith(('.txt','.html','.pdf')):
            matches.append(os.path.join(root, filename))
print("\n".join(matches))
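There is no need for a for-loop over dirnames here, because os.walk already visits every subdirectory and yields its files in the third element of each tuple.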
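The sketch below may help explain the C:\Python32\a style output from the second attempt and lets the os.walk approach be tried without the original data. It assumes a throwaway directory tree with made-up file names, and find_files is a hypothetical helper name, not something from the question.

```python
import os
import tempfile

def find_files(topdir, suffixes=(".txt", ".html", ".pdf")):
    """Return sorted paths under topdir whose names end with one of suffixes."""
    matches = []
    for root, dirnames, filenames in os.walk(topdir):
        for filename in filenames:
            if filename.endswith(suffixes):
                matches.append(os.path.join(root, filename))
    return sorted(matches)

# Iterating over a directory name (a string) yields single characters.
chars = list("awards_1990")
print(chars[:3])

# abspath resolves a relative name against the current working directory,
# not against the directory that was listed.
resolved = os.path.abspath("a")
print(resolved)

# Build a small throwaway tree (made-up names) and walk it.
with tempfile.TemporaryDirectory() as top:
    sub = os.path.join(top, "awards_1990", "awards_1990_00")
    os.makedirs(sub)
    for name in ("a.txt", "b.html", "c.jpg"):
        with open(os.path.join(sub, name), "w") as fh:
            fh.write("sample")
    found = [os.path.relpath(p, top) for p in find_files(top)]
print(found)
```

Each character of a directory name is a relative path, so os.path.abspath places it under whatever directory the script was started from, which would account for the C:\Python32 prefix in the question.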