I have a list of folders like this:
u'Magazines/testfolder1',
u'Magazines/testfolder1/folder1/folder2/folder3',
u'Magazines/testfolder1/folder1/',
u'Magazines/testfolder1/folder1/folder2/',
u'Magazines/testfolder2',
u'Magazines/testfolder2/folder1/folder2/folder3',
u'Magazines/testfolder2/folder1/',
u'Magazines/testfolder2/folder1/folder2/',
u'Magazines/testfolder3',
u'Magazines/testfolder3/folder1/folder2/folder3',
u'Magazines/testfolder3/folder1/',
u'Magazines/testfolder3/folder1/folder2/',
What I want now is a list of only the parent folders.
That is, in the example above, I want it reduced to:
u'Magazines/testfolder1',
u'Magazines/testfolder2',
u'Magazines/testfolder3',
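because these all contain the subfolders.
I add folders to my database recursively, so if I have testfolder1, the script automatically recurses into its subfolders. So if the parent is also in the list, I don't need its subfolders in the list.
How can I do this?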
Answer 0 (score: 2)
Use a set:
>>> list_of_folders = [
... u'Magazines/testfolder1',
... u'Magazines/testfolder1/folder1/folder2/folder3',
... u'Magazines/testfolder1/folder1/',
... u'Magazines/testfolder1/folder1/folder2/',
... u'Magazines/testfolder2',
... u'Magazines/testfolder2/folder1/folder2/folder3',
... u'Magazines/testfolder2/folder1/',
... u'Magazines/testfolder2/folder1/folder2/',
... u'Magazines/testfolder3',
... u'Magazines/testfolder3/folder1/folder2/folder3',
... u'Magazines/testfolder3/folder1/',
... u'Magazines/testfolder3/folder1/folder2/',
... ]
>>> result = set()
>>> for folder in list_of_folders:
...     for parent in result:
...         if folder.startswith(parent):
...             break
...     else:
...         result.add(folder)
...
>>> result
{'Magazines/testfolder3', 'Magazines/testfolder2', 'Magazines/testfolder1'}
UPDATE
list_of_folders = [
...
]
result = set()
for folder in list_of_folders:
    if all(not folder.startswith(parent) for parent in result):
        result.add(folder)
print(result)
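The loop above relies on each parent appearing in the list before its children, and a plain startswith check would also treat a sibling such as 'Magazines/testfolder10' as a child of 'Magazines/testfolder1'. Here is a minimal sketch that avoids both assumptions. The function name top_level_folders is hypothetical, not something from the question, and treating a trailing slash as insignificant is my assumption:

```python
def top_level_folders(folders):
    """Keep only folders that have no ancestor elsewhere in the list."""
    result = []
    # Strip trailing slashes, drop duplicates, and sort so that every
    # parent is visited before its children.
    for folder in sorted(set(f.rstrip('/') for f in folders)):
        # Compare against "parent/" so that 'a/b' does not swallow 'a/bc'.
        if not any(folder.startswith(parent + '/') for parent in result):
            result.append(folder)
    return result
```

Calling top_level_folders(list_of_folders) on the list from the question should give the three testfolder entries, in sorted order.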
Answer 1 (score: 0)
How about using a regular expression?
import re
l = [
u'Magazines/testfolder1',
u'Magazines/testfolder1/folder1/folder2/folder3',
u'Magazines/testfolder1/folder1/',
u'Magazines/testfolder1/folder1/folder2/',
u'Magazines/testfolder2',
u'Magazines/testfolder2/folder1/folder2/folder3',
u'Magazines/testfolder2/folder1/',
u'Magazines/testfolder2/folder1/folder2/',
u'Magazines/testfolder3',
u'Magazines/testfolder3/folder1/folder2/folder3',
u'Magazines/testfolder3/folder1/',
u'Magazines/testfolder3/folder1/folder2/',
]
expect = [
u'Magazines/testfolder1',
u'Magazines/testfolder2',
u'Magazines/testfolder3',
]
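# Keep only paths with exactly two components ("root/folder").
# A list comprehension is used because filter() returns an iterator on Python 3.
result = [x for x in l if re.match(r'^[^/]+/[^/]+$', x)]
assert expect == result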
Answer 2 (score: 0)
I believe the following is the solution you are looking for:
lst = [
u'Magazines/testfolder1',
u'Magazines/testfolder1/folder1/folder2/folder3',
u'Magazines/testfolder1/folder1/',
u'Magazines/testfolder1/folder1/folder2/',
u'Magazines/testfolder2',
u'Magazines/testfolder2/folder1/folder2/folder3',
u'Magazines/testfolder2/folder1/',
u'Magazines/testfolder2/folder1/folder2/',
u'Magazines/testfolder3',
u'Magazines/testfolder3/folder1/folder2/folder3',
u'Magazines/testfolder3/folder1/',
u'Magazines/testfolder3/folder1/folder2/'
]
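# Iterate over copies so that removing items does not disturb the loops.
for x in lst[:]:
    for y in lst[:]:
        # Treat y as a subfolder of x when it is longer and begins with x.
        if y.startswith(x) and len(x) < len(y):
            lst.remove(y)
print(lst)
Output
[u'Magazines/testfolder1', u'Magazines/testfolder2', u'Magazines/testfolder3']
This program iteratively removes the subfolders from the list, leaving only the parent folders.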
Answer 3 (score: 0)
l =[u'Magazines/testfolder1',
u'Magazines/testfolder1/folder1/folder2/folder3',
u'Magazines/testfolder1/folder1/',
u'Magazines/testfolder1/folder1/folder2/',
u'Magazines/testfolder2',
u'Magazines/testfolder2/folder1/folder2/folder3',
u'Magazines/testfolder2/folder1/',
u'Magazines/testfolder2/folder1/folder2/',
u'Magazines/testfolder3',
u'Magazines/testfolder3/folder1/folder2/folder3',
u'Magazines/testfolder3/folder1/',
u'Magazines/testfolder3/folder1/folder2/', ]
mincount = min(s.count('/') for s in l)
[d for d in sorted(l) if d.count('/') <= mincount]
#=> [u'Magazines/testfolder1', u'Magazines/testfolder2', u'Magazines/testfolder3']
It's not overly clever, but it works where there is a common root.
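To make the "common root" caveat concrete, here is a small sketch of the same depth-counting idea wrapped in a function. The name shallowest_folders is hypothetical, and stripping trailing slashes before counting is my assumption, so that 'a/b/' and 'a/b' count as the same depth:

```python
def shallowest_folders(folders):
    """Return the folders at the minimum depth found in the list."""
    # Normalise trailing slashes so they do not add a level of depth.
    cleaned = sorted(set(f.rstrip('/') for f in folders))
    if not cleaned:
        return []
    min_depth = min(f.count('/') for f in cleaned)
    return [f for f in cleaned if f.count('/') == min_depth]


# Limitation: a deeper folder is dropped even when its parent is absent.
mixed = ['Magazines/a', 'Magazines/a/b', 'Books/x/y']
print(shallowest_folders(mixed))  # 'Books/x/y' is lost here
```

So this approach is only safe when every top-level folder you care about sits at the same depth in the list.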