Matching elements from one list to another, skipping elements that have already been matched (Python)

Time: 2011-10-05 22:32:33

Tags: python list text-processing

The solution to this is probably obvious to many of you, but I'm stuck, so I thought I'd ask.

I have two lists in the following format:

target_list =['apples 1', 'oranges 1', 'bananas 2', 'apples 3', 'oranges 2','mango 3', 'apples 2']
source_list =  ['A apples', 'B mango', 'C apples', 'D bananas', 'E oranges','F apples', 'G oranges']

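I need to iterate over each target_item in target_list and, if target_item.split()[0] matches source_item.split()[1] for some source_item in source_list, return target_item.split()[0], source_item.split()[0], target_item.split()[1]. It is crucial that the output contains no duplicate source_item / target_item pairs.

Here is what I mean. Suppose I use a regular old for loop:

for target_item in target_list:
    for source_item in source_list:
        if source_item.split()[1] == target_item.split()[0]:
            print(target_item.split()[0], source_item.split()[0], target_item.split()[1])

The (incorrect) output I get is:

apples A 1
apples C 1
apples F 1
oranges E 1
oranges G 1
bananas D 2
apples A 3
apples C 3
apples F 3
oranges E 2
oranges G 2
mango B 3
apples A 2
apples C 2
apples F 2

Note that the source/target pairs apples A, apples C and apples F are each repeated 3 times, with different numbers. The same goes for the oranges pairs. What I need is:

apples A 1
apples C 2
apples F 3
oranges E 1
oranges G 2
bananas D 2
mango B 3

That is, each entry should always have a different source and target.

Also, it doesn't matter if the numeric labels are permuted differently within each group of 'apples $LETTER' and 'oranges $LETTER' pairs; any such permutation is equally good output.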

1 Answer:

Answer 0: (score: 2)

target_list =['apples 1', 'oranges 1', 'bananas 2', 'apples 3', 'oranges 2','mango 3', 'apples 2']
source_list =  ['A apples', 'B mango', 'C apples', 'D bananas', 'E oranges','F apples', 'G oranges']

from collections import defaultdict

# you want each target fruit to be a group, so use them as keys in a dict
# use a defaultdict list so whenever you access a key that doesn't exist
# it creates an empty list at that key
td = defaultdict(list)

for item in target_list:
    key, value = item.split()
    # the value for each fruit is a list of the numbers associated with it
    td[key].append(value)

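# for each source item find a match and pop a number from the list
# so that each pair gets a different number
for item in source_list:
    letter, key = item.split()
    # td.get(key) is falsy both for an unknown fruit and for a fruit whose
    # numbers are already used up, so pop() never hits an empty list
    if td.get(key):
        print(key, letter, td[key].pop())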
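
If the output should also be grouped by fruit, in the order the fruits first appear in target_list (as in the desired output in the question), one possible variation is sketched below. This is only an illustration, not part of the original answer: the function name match_pairs is hypothetical, it assumes every item is exactly two whitespace-separated words, and it relies on dicts keeping insertion order (Python 3.7+).

```python
from collections import defaultdict

target_list = ['apples 1', 'oranges 1', 'bananas 2', 'apples 3',
               'oranges 2', 'mango 3', 'apples 2']
source_list = ['A apples', 'B mango', 'C apples', 'D bananas',
               'E oranges', 'F apples', 'G oranges']


def match_pairs(target_list, source_list):
    """Pair each source letter with one unused number of the same fruit.

    Hypothetical helper: returns (fruit, letter, number) tuples grouped
    by fruit, in the order fruits first appear in target_list.
    """
    numbers = defaultdict(list)  # fruit -> numbers, in target_list order
    for item in target_list:
        fruit, number = item.split()
        numbers[fruit].append(number)

    letters = defaultdict(list)  # fruit -> letters, in source_list order
    for item in source_list:
        letter, fruit = item.split()
        letters[fruit].append(letter)

    pairs = []
    for fruit, nums in numbers.items():
        # zip stops at the shorter list, so surplus letters or numbers
        # are silently dropped instead of raising an error
        for letter, number in zip(letters.get(fruit, []), nums):
            pairs.append((fruit, letter, number))
    return pairs


for fruit, letter, number in match_pairs(target_list, source_list):
    print(fruit, letter, number)
```

With the lists from the question this pairs A, C, F with the apples numbers 1, 3, 2 and E, G with the oranges numbers 1, 2. That is a different permutation of the numbers than the one shown in the question, which the question says is acceptable.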