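How to compare two lists where a string matches an element in the other list

Time: 2015-05-03 23:58:02

Tags: python list python-2.7

Hi, I'm still learning, so you may have to bear with me. I have two lists that I want to compare: any entry that matches a field should be kept and appended to one output list, and anything that doesn't match should be added to another output list. Here is my code: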

def EntryToFieldMatch(Entry, Fields):
    valid = []
    invalid = []
    for c in Entry:
        count = 0
        for s in Fields:
            count +=1
            if s in c:
                valid.append(c)
            elif count == len(Entry):
                invalid.append(s)
                Fields.remove(s)



    print valid
    print "-"*50
    print invalid


def main():
    vEntry = ['27/04/2014', 'Hours = 28', 'Site = Abroad', '03/05/2015', 'Date = 28-04-2015', 'Travel = 2']
    Fields = ['Week_Stop', 'Date', 'Site', 'Hours', 'Travel', 'Week_Start', 'Letters']
    EntryToFieldMatch(vEntry, Fields)

if __name__ == "__main__":
    main()

The output looks fine, except that the two output lists between them don't contain all of the fields. This is the output I get:

['Hours = 28', 'Site = Abroad', 'Date = 28-04-2015', 'Travel = 2']
--------------------------------------------------
['Week_Start', 'Letters']

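I just can't figure out why the second list doesn't include 'Week_Stop'. I've run the debugger and stepped through the code several times, to no avail. I've read about sets, but I didn't see any method that returns the fields that match and discards the ones that don't. If anyone knows a way to simplify this whole process, I'm open to suggestions too. I'm not asking for free code, just a nod in the right direction. Python 2.7, thanks.

2 answers:

Answer 0 (score: 0)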

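You only have two conditions: either the field is in the string, or count equals the length of Entry. Neither of them catches the first element, 'Week_Stop'. Because len(Entry) is 6, the elif can only fire on the sixth field visited in a pass. The length of Fields goes from 7 to 6 to 5 as items are removed, catching Week_Start, but it never gets to 0, so you never reach Week_Stop.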
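
To see this concretely, here is a minimal sketch that re-runs the same nested loop on a copy of Fields and records every time the elif branch fires. The name trace_original_loop is hypothetical (it is not in the question), and the code is written so that it should run on both Python 2.7 and Python 3.

```python
def trace_original_loop(entries, fields):
    """Replay the question's loop and record when the elif branch fires."""
    fields = list(fields)  # work on a copy so the caller's list is untouched
    fired = []
    for c in entries:
        count = 0
        for s in fields:
            count += 1
            if s in c:
                continue  # the original appends c to valid here
            elif count == len(entries):
                # the original appends s to invalid and removes it from Fields
                fired.append((c, count, s))
                fields.remove(s)
    return fired


entries = ['27/04/2014', 'Hours = 28', 'Site = Abroad',
           '03/05/2015', 'Date = 28-04-2015', 'Travel = 2']
fields = ['Week_Stop', 'Date', 'Site', 'Hours', 'Travel',
          'Week_Start', 'Letters']
fired = trace_original_loop(entries, fields)
```

On the question's data, fired holds only two records, ('27/04/2014', 6, 'Week_Start') and ('Hours = 28', 6, 'Letters'). Once Fields has shrunk to five items, count can no longer reach 6, so nothing else is ever moved to invalid. Removing items from a list while iterating over it also ends the inner loop early, which is another reason to avoid that pattern.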

A more efficient approach is to use a set, or a collections.OrderedDict if you want to keep order:

from collections import OrderedDict
def EntryToFieldMatch(Entry, Fields):
    valid = []
    # create an OrderedDict from the words in Fields
    # dict lookups are O(1)
    st = OrderedDict.fromkeys(Fields)
    # iterate over Entry
    for word in Entry:
        # split the entry once on whitespace
        spl = word.split(None, 1)
        # if the first word appears in our dict keys
        if spl[0] in st:
            # add to valid list
            valid.append(word)
            # remove the key
            del st[spl[0]]
    print valid
    print "-"*50
    # only invalid words will be left
    print st.keys()

Output:

['Hours = 28', 'Site = Abroad', 'Date = 28-04-2015', 'Travel = 2']
--------------------------------------------------
['Week_Stop', 'Week_Start', 'Letters']

For large lists this will be significantly faster than your quadratic approach. Having O(1) dictionary lookups means your code goes from quadratic to linear; every time you do in Fields on a list, that is an O(n) operation.

Using a set is similar; the difference is that a set does not keep order.
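
A set-based variant could look roughly like the sketch below. It is an illustration rather than the answer's original code: the function name entry_to_field_match_set is hypothetical, it returns its results instead of printing them, and it assumes, as the OrderedDict version does, that the field name is the first whitespace-separated word of each entry. It should run on both Python 2.7 and Python 3.

```python
def entry_to_field_match_set(entries, fields):
    """Return (valid entries, set of fields that never matched)."""
    remaining = set(fields)  # O(1) average-case membership tests
    valid = []
    for entry in entries:
        # split once on whitespace; the first token is the candidate field
        parts = entry.split(None, 1)
        if parts and parts[0] in remaining:
            valid.append(entry)
            remaining.discard(parts[0])
    return valid, remaining


entries = ['27/04/2014', 'Hours = 28', 'Site = Abroad',
           '03/05/2015', 'Date = 28-04-2015', 'Travel = 2']
fields = ['Week_Stop', 'Date', 'Site', 'Hours', 'Travel',
          'Week_Start', 'Letters']
valid, unmatched = entry_to_field_match_set(entries, fields)
```

Here valid keeps the order of the entries, while unmatched is a set containing 'Week_Stop', 'Week_Start' and 'Letters' in no guaranteed order. If the order of the leftover fields matters, one option is to filter the original list: [f for f in fields if f in unmatched].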

Answer 1 (score: -1)

Using list comprehensions:

def EntryToFieldMatch(Entries, Fields):

    # using list comprehension 
    # (typically they go on one line, but they can be multiline 
    #  so they look more like their for loop equivalents)
    valid = [entry for entry in Entries
                 if any([field in entry 
                         for field in Fields])]

    invalidEntries = [entry for entry in Entries 
                          if not any([field in entry 
                                      for field in Fields])]

    missedFields = [field for field in Fields
                          if not any([field in entry 
                                      for entry in Entries])]

    print 'valid entries:', valid
    print '-' * 80
    print 'invalid entries:', invalidEntries
    print '-' * 80
    print 'missed fields:', missedFields

vEntry = ['27/04/2014', 'Hours = 28', 'Site = Abroad', '03/05/2015', 'Date = 28-04-2015', 'Travel = 2']
Fields = ['Week_Stop', 'Date', 'Site', 'Hours', 'Travel', 'Week_Start', 'Letters']
EntryToFieldMatch(vEntry, Fields)

Output:

valid entries: ['Hours = 28', 'Site = Abroad', 'Date = 28-04-2015', 'Travel = 2']
--------------------------------------------------------------------------------
invalid entries: ['27/04/2014', '03/05/2015']
--------------------------------------------------------------------------------
missed fields: ['Week_Stop', 'Week_Start', 'Letters']
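
As a possible refinement, the valid and invalid entries can be split in one pass over the entries, and any() can be given a generator expression so that it stops at the first match instead of building a full list first. The sketch below is an assumption-laden illustration: the name partition_entries is hypothetical, results are returned rather than printed, and it keeps the same substring matching as the comprehensions above. It should run on both Python 2.7 and Python 3.

```python
def partition_entries(entries, fields):
    """Return (valid entries, invalid entries, fields that matched nothing)."""
    valid, invalid = [], []
    for entry in entries:
        # generator expression: any() short-circuits on the first hit
        if any(field in entry for field in fields):
            valid.append(entry)
        else:
            invalid.append(entry)
    missed = [field for field in fields
              if not any(field in entry for entry in entries)]
    return valid, invalid, missed


entries = ['27/04/2014', 'Hours = 28', 'Site = Abroad',
           '03/05/2015', 'Date = 28-04-2015', 'Travel = 2']
fields = ['Week_Stop', 'Date', 'Site', 'Hours', 'Travel',
          'Week_Start', 'Letters']
valid, invalid, missed = partition_entries(entries, fields)
```

Note that field in entry is a substring test, so a field such as 'Date' would also match a hypothetical entry like 'Update = 3'. Comparing against the text before the '=' sign would be stricter, if that is what is wanted.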