Python: an efficient way to compare list items against partial items in another list

Date: 2012-03-28 06:37:29

Tags: python list

I have two lists:

link_ids = ['111','222','333']
filenames = ['111-foo.txt','111-bar.txt','222.txt']

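I want to do two things. First, find the filenames that match a link ID. Second, build a list of the link IDs that have no matching file.

It seems very simple, but I'm struggling with it! This clearly doesn't work the way it should, but it's the best I could come up with: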

missing = []
for i in link_ids:
    for f in filenames:
        if i in f:
            print 'match found'
        else:
            missing.append(i)

Please help if you can!

4 answers:

Answer 0: (score: 1)

A namedtuple is a good fit for this problem. It gives you named attributes without the extra overhead associated with a (non-optimized) class.

import collections, os

link_ids = ['111','222','333']
filenames = ['111-foo.txt','111-bar.txt','222.txt']
File = collections.namedtuple("File", "fname fext") # named-tuple set-up

files = {File(*os.path.splitext(f)) for f in filenames}
# -> set([File(fname='222', fext='.txt'), 
#         File(fname='111-bar', fext='.txt'), 
#         File(fname='111-foo', fext='.txt')])

"First, find the filenames that match a link ID.":

matched = [f for f in files if f.fname in link_ids]
# -> [File(fname='222', fext='.txt')]

"Second, build a list of the link IDs that have no matching file.":

unmatched = [l for l in link_ids if l not in {f.fname for f in files}]
# -> ['111', '333']

In the comments, you mentioned that you need the full filename after matching. For that, you can do:

matched_filenames = [f.fname + f.fext for f in matched]
# -> ['222.txt']
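
Note that the exact comparison above reports '111' as unmatched, because the stems are '111-foo' and '111-bar'. If those files should count as matches for '111', one possible sketch groups filenames by a leading ID. It assumes the ID is the part of the stem before the first '-', which the question does not state; leading_id and files_by_id are hypothetical names used only for illustration.

```python
import os
from collections import defaultdict

link_ids = ['111', '222', '333']
filenames = ['111-foo.txt', '111-bar.txt', '222.txt']

def leading_id(filename):
    # Hypothetical helper: assumes the ID is the stem up to the first '-'.
    stem = os.path.splitext(filename)[0]
    return stem.partition('-')[0]

# Group the full filenames under the ID they start with.
files_by_id = defaultdict(list)
for name in filenames:
    files_by_id[leading_id(name)].append(name)

# Link IDs with at least one file, mapped to those files.
matched = {i: files_by_id[i] for i in link_ids if i in files_by_id}
# Link IDs with no file at all, in their original order.
unmatched = [i for i in link_ids if i not in files_by_id]
```

With the sample data, this should leave only '333' in unmatched while keeping the full filenames for '111' and '222'.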

Answer 1: (score: 1)

I've only just started learning Python, but I'll give it a try...

Perhaps you could use sets?

>>> file_set = {i[:-4] for i in filenames}
>>> matched_links = set(link_ids) & file_set
>>> unmatched_links = set(link_ids) - file_set
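
For reference, here is a self-contained sketch of the same idea run against the lists from the question. It swaps the i[:-4] slice for os.path.splitext, which is my substitution rather than part of the original answer, so that extensions of other lengths are handled too. Sets are unordered, so the results are sorted before use.

```python
import os

link_ids = ['111', '222', '333']
filenames = ['111-foo.txt', '111-bar.txt', '222.txt']

# Strip the extension from every filename, whatever its length.
file_set = {os.path.splitext(name)[0] for name in filenames}

# Intersection: link IDs that equal some filename stem exactly.
matched_links = set(link_ids) & file_set
# Difference: link IDs with no exactly matching stem.
unmatched_links = set(link_ids) - file_set

print(sorted(matched_links))
print(sorted(unmatched_links))
```

Because the comparison is exact, '111' ends up in unmatched_links here: its files are named '111-foo' and '111-bar', not '111'.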

Answer 2: (score: 0)

First, build a list of all the filename IDs, without the '.txt' extension:

>>> link_ids = ['111','222','333']
>>> filenames = ['111.txt','222.txt']
>>> filename_ids = [i[:-4] for i in filenames]
>>> filename_ids
['111', '222']

Then you can create two lists, one of the matched IDs and one of the unmatched IDs:

>>> match_ids = [i for i in link_ids if i in filename_ids]
>>> match_ids
['111', '222']
>>> not_match_ids = [i for i in link_ids if i not in filename_ids]
>>> not_match_ids
['333']
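
If the lists get long, each "in filename_ids" test above scans the whole list. A minimal variation, assuming the filenames all end in '.txt' as in this answer, keeps the IDs in a set for faster lookups while the results stay in link_ids order:

```python
link_ids = ['111', '222', '333']
filenames = ['111.txt', '222.txt']

# A set gives constant-time membership tests on average.
filename_ids = {name[:-4] for name in filenames}

# List comprehensions over link_ids preserve the original order.
match_ids = [i for i in link_ids if i in filename_ids]
not_match_ids = [i for i in link_ids if i not in filename_ids]
```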

Answer 3: (score: 0)

link_ids = ['111','222','333']
filenames = ['111-foo.txt','111-bar.txt','222.txt']

missing = []
found = []
for i in link_ids:
    for f in filenames:
        if i in f:
            print 'match found'
            found.append(i)

missing = list(set(link_ids) - set(found))
print 'Missing link ids: ', missing

Output:

match found
match found
match found
Missing link ids:  ['333']
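
The nested loop above prints once per matching file and appends the same ID to found more than once ('111' appears twice). A possible variant, sketched here in Python 3 syntax and not part of the original answer, uses any() so that each link ID is checked and recorded exactly once:

```python
link_ids = ['111', '222', '333']
filenames = ['111-foo.txt', '111-bar.txt', '222.txt']

found = []
missing = []
for link_id in link_ids:
    # any() stops at the first filename that contains the ID.
    if any(link_id in name for name in filenames):
        found.append(link_id)
    else:
        missing.append(link_id)

print('Found link ids:', found)
print('Missing link ids:', missing)
```

Like the original, this is a substring test, so an ID such as '111' would also match a file named '2111.txt'.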