List duplicated lists in a list of lists, except under certain conditions

Time: 2015-01-27 21:29:00

Tags: python python-2.7

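I have a list of lists, where each inner list pairs a task type with a project. There are 4 task types in total. I want to generate a list of the projects that have more than one task type, except when a project's tasks are exactly one of certain pairs of task types. I have worked out how to get the list of projects with more than one task, but I don't know how to leave out the excluded combinations.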

Pairs to exclude from the output: (Task Type 1, Task Type 4), (Task Type 3, Task Type 4)

If a project has an excluded pair plus other task types, it should still be included in the output.
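
The rule can be sketched as a check on a single project's task entries. The names `EXCLUDED_PAIRS` and `is_reported` are hypothetical, and comparing task types as a set (so order and duplicates are ignored) is an assumption rather than something stated above.

```python
# Hypothetical helper illustrating the rule; the names are not from the question.
EXCLUDED_PAIRS = {
    frozenset(['Task Type 1', 'Task Type 4']),
    frozenset(['Task Type 3', 'Task Type 4']),
}

def is_reported(task_types):
    """Return True if a project with these task entries belongs in the output."""
    # A project needs more than one task entry to be considered at all.
    if len(task_types) < 2:
        return False
    # Assumption: order and duplicates are ignored when matching excluded pairs.
    return frozenset(task_types) not in EXCLUDED_PAIRS

print(is_reported(['Task Type 1', 'Task Type 2', 'Task Type 4']))  # True
print(is_reported(['Task Type 3', 'Task Type 4']))                 # False
print(is_reported(['Task Type 1', 'Task Type 1']))                 # True
print(is_reported(['Task Type 4']))                                # False
```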

Input:

my_list = [['Task Type 1', 'Project 1'],['Task Type 2', 'Project 1'],['Task Type 4', 'Project 1'],
          ['Task Type 3', 'Project 2'],['Task Type 4', 'Project 2'],
          ['Task Type 1', 'Project 3'],['Task Type 1', 'Project 3'],
          ['Task Type 4', 'Project 4']]

Starting code:

from collections import Counter
my_project_list = zip(*my_list)[1]
cnt = Counter(my_project_list)
my_duplicate_list = [k for k, v in cnt.iteritems() if v > 1]
print my_duplicate_list
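
The snippet above relies on Python 2 behavior (`zip` returning a list, `iteritems`, and the `print` statement). As a rough sketch, the same count can be written so that it runs on both Python 2.7 and Python 3. The `sorted` call is an addition here, used only to make the output order predictable.

```python
from collections import Counter

my_list = [['Task Type 1', 'Project 1'], ['Task Type 2', 'Project 1'], ['Task Type 4', 'Project 1'],
           ['Task Type 3', 'Project 2'], ['Task Type 4', 'Project 2'],
           ['Task Type 1', 'Project 3'], ['Task Type 1', 'Project 3'],
           ['Task Type 4', 'Project 4']]

# Count how many task entries each project has.
cnt = Counter(project for _task_type, project in my_list)

# Keep projects that appear more than once (sorted only for a stable order).
my_duplicate_list = sorted(k for k, v in cnt.items() if v > 1)
print(my_duplicate_list)  # ['Project 1', 'Project 2', 'Project 3']
```

Project 2 still shows up at this point, which is exactly the part that the exclusion rule has to remove.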

Desired output:

['Project 1', 'Project 3']

1 answer:

Answer 0: (score: 1)

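Here is one way:

First, we'll create a mapping from each project to the list of its task types.

Then, we'll create a filter that takes a list of rules and returns only the projects whose task types match none of the rules.

So here is the complete code with detailed comments (thanks to @DSM):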

#!/usr/bin/env python
from collections import defaultdict

my_list = [
    ['Task Type 1', 'Project 1'],
    ['Task Type 2', 'Project 1'],
    ['Task Type 4', 'Project 1'],
    ['Task Type 3', 'Project 2'],
    ['Task Type 4', 'Project 2'],
    ['Task Type 1', 'Project 3'],
    ['Task Type 1', 'Project 3'],
    ['Task Type 4', 'Project 4']
]

# create a mapping keyed by the value we filter on;
# in our case, project to its task types
projects_to_types = defaultdict(list)
for x in my_list:
    projects_to_types[x[1]].append(x[0])

# sort every list of types - this guarantees that
# two lists holding the same types compare equal
# (lists are ordered, so order would otherwise matter)
projects_to_types = {k:sorted(v) for k, v in projects_to_types.iteritems()}

# a function that builds a filter over a mapping like the
# one we created; the returned filter is a generator function
def rules_filter_generator(original):
    # take a list of rules and filter out keys whose
    # values match any rule
    def filter_restricted(rules, minimum_length=2):
        # a set gives fast membership tests and more readable code.
        # convert each rule to a sorted tuple, since lists are mutable
        # and therefore not hashable.
        rule_set = set(tuple(sorted(rule)) for rule in rules)
        for k, v in original.iteritems():
            if len(v) >= minimum_length and tuple(v) not in rule_set:
                yield k
    return filter_restricted

# use the filter specifically on the mapping we've created
generator = rules_filter_generator(projects_to_types)

# test (consume the generator to a list)
print list(generator([
    ['Task Type 3', 'Task Type 4'],
    ['Task Type 3', 'Task Type 3', 'Task Type 4']
]))

# prints: ['Project 3', 'Project 1']
# (a list; the order may vary, since dict iteration order is arbitrary in Python 2)
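
For comparison, here is a sketch of the same idea that runs on both Python 2.7 and Python 3 and uses the two excluded pairs from the question. It is a variation, not the code above: the function name `projects_with_multiple_tasks` is hypothetical, and rules are compared as `frozenset` values, so a duplicated variant such as `['Task Type 3', 'Task Type 3', 'Task Type 4']` does not need its own rule.

```python
from collections import defaultdict

my_list = [
    ['Task Type 1', 'Project 1'],
    ['Task Type 2', 'Project 1'],
    ['Task Type 4', 'Project 1'],
    ['Task Type 3', 'Project 2'],
    ['Task Type 4', 'Project 2'],
    ['Task Type 1', 'Project 3'],
    ['Task Type 1', 'Project 3'],
    ['Task Type 4', 'Project 4'],
]

# The excluded pairs from the question; frozenset ignores order and duplicates.
excluded = {
    frozenset(['Task Type 1', 'Task Type 4']),
    frozenset(['Task Type 3', 'Task Type 4']),
}

def projects_with_multiple_tasks(rows, excluded_sets, minimum_length=2):
    """Return sorted projects with enough task entries whose types are not excluded."""
    # Map each project to the list of its task types.
    projects_to_types = defaultdict(list)
    for task_type, project in rows:
        projects_to_types[project].append(task_type)
    # Keep projects with enough entries whose distinct types are not an excluded set.
    return sorted(
        project
        for project, types in projects_to_types.items()
        if len(types) >= minimum_length and frozenset(types) not in excluded_sets
    )

print(projects_with_multiple_tasks(my_list, excluded))  # ['Project 1', 'Project 3']
```

One behavioral difference to be aware of: because duplicates collapse in a set, a project with tasks 3, 4, 4 is treated as the excluded pair (3, 4), whereas the tuple-based version only excludes it if that exact combination is listed as a rule.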