Question

I have the following data in a CSV file.

Entity_A,Category1,Rule1,1990,1992,2
Entity_B,Category1,Rule1,1990,1993,3
Entity_C,Category2,Rule2,1992,1994,2
Entity_A,Category2,Rule2,1992,1993,1
Entity_B,Category2,Rule2,1992,1993,1
Entity_C,Category1,Rule1,1990,1994,4

It basically says: Entity_A implemented Rule1 in 1992, and Rule1 was proposed in 1990. So it took 2 years to implement (1992 - 1990).

I have already written code that gives the rules two entities have in common. Here it is:

print(set(item[2] for item in L if item[0] == 'Entity_A').intersection(item[2] for item in L if item[0] == 'Entity_B'))  # prints the rules common to A and B; in this case there are 2.
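The snippet above assumes L is already a list of rows. A minimal sketch of how it might be loaded with the standard csv module, assuming the columns are entity, category, rule, year proposed, year implemented, and years taken. The inline CSV_TEXT and the helper name common_rules are made up for illustration; in practice the rows would be read from the real file.

```python
import csv
import io

# Hypothetical inline copy of the CSV data shown above.
CSV_TEXT = """\
Entity_A,Category1,Rule1,1990,1992,2
Entity_B,Category1,Rule1,1990,1993,3
Entity_C,Category2,Rule2,1992,1994,2
Entity_A,Category2,Rule2,1992,1993,1
Entity_B,Category2,Rule2,1992,1993,1
Entity_C,Category1,Rule1,1990,1994,4
"""

# Each row is a list of strings:
# [entity, category, rule, year_proposed, year_implemented, years_taken]
L = list(csv.reader(io.StringIO(CSV_TEXT)))


def common_rules(a, b, rows):
    """Return the set of rules that both entities implemented."""
    rules_a = {row[2] for row in rows if row[0] == a}
    rules_b = {row[2] for row in rows if row[0] == b}
    return rules_a & rules_b


print(sorted(common_rules('Entity_A', 'Entity_B', L)))  # ['Rule1', 'Rule2']
```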

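What I want is this: say Entity_A implemented a rule in 1992; I then want to know how many rules Entity_B implemented after A did. In the dataset above, the answer is Entity_A - Entity_B = 1, because B implemented one rule after A had implemented it. Basically, B follows A.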

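For this I need to compare A's item[5] with B's item[5]. How can I compare these inside the set computation and count the result? I basically want to print the following: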

Entity_A, Entity_B, 1 -> the relationship between A and B, where B follows A in one rule.

Entity_A, Entity_C, 2 -> C follows A in two rule implementations.

Answer 1

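def numRulesBImplementedAfterA(a, b, L):
    # Map each rule A implemented (item[2]) to the year A implemented it (item[4]).
    a_date_map = {item[2]: item[4] for item in L if item[0] == a}
    count = 0
    for item in L:
        if item[0] != b:
            continue
        b_rule = item[2]
        b_implemented_date = item[4]
        a_implemented_date = a_date_map.get(b_rule)
        if a_implemented_date is None:
            # A never implemented this rule, so B cannot have followed A.
            continue
        # Four-digit year strings compare correctly; use int() for other formats.
        if b_implemented_date > a_implemented_date:
            count += 1
    return count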
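As a usage sketch, here is a self-contained variant run against the sample data from the question. It assumes each row is [entity, category, rule, year_proposed, year_implemented, years_taken], so the rule sits at index 2 and the implementation year at index 4. The name count_followed_rules and the hard-coded L are illustrative only.

```python
# Sample rows copied from the question (all values kept as strings, as csv would give them).
L = [
    ['Entity_A', 'Category1', 'Rule1', '1990', '1992', '2'],
    ['Entity_B', 'Category1', 'Rule1', '1990', '1993', '3'],
    ['Entity_C', 'Category2', 'Rule2', '1992', '1994', '2'],
    ['Entity_A', 'Category2', 'Rule2', '1992', '1993', '1'],
    ['Entity_B', 'Category2', 'Rule2', '1992', '1993', '1'],
    ['Entity_C', 'Category1', 'Rule1', '1990', '1994', '4'],
]


def count_followed_rules(a, b, rows):
    """Count rules that b implemented strictly later than a did."""
    # rule -> year in which a implemented it
    a_years = {row[2]: int(row[4]) for row in rows if row[0] == a}
    return sum(
        1
        for row in rows
        if row[0] == b and row[2] in a_years and int(row[4]) > a_years[row[2]]
    )


for follower in ('Entity_B', 'Entity_C'):
    print('Entity_A, %s, %d' % (follower, count_followed_rules('Entity_A', follower, L)))
# Entity_A, Entity_B, 1
# Entity_A, Entity_C, 2
```

Rule2 is not counted for Entity_B because both entities implemented it in 1993, and the comparison is strict.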

Answer 2

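Because you build the sets without the information you need, you cannot compare them using set operations alone. If you want to build the set of rules that Entity_B implemented after Entity_A, you will need to do a bit more work: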

a_impl = [(x[2], x[4]) for x in L if x[0] == 'Entity_A']  # (rule, year_implemented)
b_impl = [(x[2], x[4]) for x in L if x[0] == 'Entity_B']  # same for B
# Each joined tuple is (a_rule, a_year, b_rule, b_year).
b_impl_after_a = filter(lambda k: k[0] == k[2] and k[1] < k[3], [x+y for x in a_impl for y in b_impl])

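This is off the top of my head, so you could probably do it more cleverly, but it works. a_impl contains a two-tuple for every rule Entity_A implemented, pairing the rule with the year it was implemented; b_impl does the same for Entity_B.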

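[x+y for x in a_impl for y in b_impl] constructs the cross product of all these rules, and the filter pulls out only the pairs where the rule is the same and Entity_A implemented it before Entity_B.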

If you only want the names of the rules, you can iterate over the filter result:

b_impl_after_a = [x[0] for x in b_impl_after_a]
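One way to avoid the cross product is to look up Entity_A's implementation year per rule in a dict. This is only a sketch, assuming the rule is at index 2 and the implementation year at index 4 of each row; the function name rules_followed is hypothetical.

```python
# Sample rows copied from the question.
L = [
    ['Entity_A', 'Category1', 'Rule1', '1990', '1992', '2'],
    ['Entity_B', 'Category1', 'Rule1', '1990', '1993', '3'],
    ['Entity_C', 'Category2', 'Rule2', '1992', '1994', '2'],
    ['Entity_A', 'Category2', 'Rule2', '1992', '1993', '1'],
    ['Entity_B', 'Category2', 'Rule2', '1992', '1993', '1'],
    ['Entity_C', 'Category1', 'Rule1', '1990', '1994', '4'],
]


def rules_followed(a, b, rows):
    """Return the set of rules that b implemented strictly after a."""
    # rule -> year in which a implemented it
    a_years = {row[2]: int(row[4]) for row in rows if row[0] == a}
    return {
        row[2]
        for row in rows
        if row[0] == b and row[2] in a_years and int(row[4]) > a_years[row[2]]
    }


print(rules_followed('Entity_A', 'Entity_B', L))          # {'Rule1'}
print(sorted(rules_followed('Entity_A', 'Entity_C', L)))  # ['Rule1', 'Rule2']
```

This makes one pass over the rows for each entity instead of pairing every row of A with every row of B.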

Answer 3

set(item[2] for item in L if (item[0]=='Entity_A' and int(item[5]) == k)).intersection([item[2] for item in L if (item[0]=='Entity_B' and int(item[5]) > k)]) #k is a counter for item[5]

Take the intersection of the two sets, each built with its own condition.
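The one-liner leaves k unspecified. One possible reading, sketched here, is to let k range over every duration (item[5]) recorded for Entity_A and take the union of the intersections. This assumes a rule has the same proposal year for every entity, so a longer duration means a later implementation. The helper name rules_followed_by_duration is made up for illustration.

```python
# Sample rows copied from the question.
L = [
    ['Entity_A', 'Category1', 'Rule1', '1990', '1992', '2'],
    ['Entity_B', 'Category1', 'Rule1', '1990', '1993', '3'],
    ['Entity_C', 'Category2', 'Rule2', '1992', '1994', '2'],
    ['Entity_A', 'Category2', 'Rule2', '1992', '1993', '1'],
    ['Entity_B', 'Category2', 'Rule2', '1992', '1993', '1'],
    ['Entity_C', 'Category1', 'Rule1', '1990', '1994', '4'],
]


def rules_followed_by_duration(a, b, rows):
    """Union, over each duration k of a, of the intersection from the one-liner."""
    followed = set()
    # Every distinct duration that entity a has in the data.
    for k in {int(item[5]) for item in rows if item[0] == a}:
        a_rules = set(item[2] for item in rows if item[0] == a and int(item[5]) == k)
        b_rules = set(item[2] for item in rows if item[0] == b and int(item[5]) > k)
        followed |= a_rules & b_rules
    return followed


print(len(rules_followed_by_duration('Entity_A', 'Entity_B', L)))  # 1
print(len(rules_followed_by_duration('Entity_A', 'Entity_C', L)))  # 2
```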
