Data frame based on transitivity

Time: 2017-08-22 14:26:35

Tags: sql r transitive-closure-table

I have a data frame

A:    V1 V2  
      1   3    
      1   4    
      3   4    
      1   6
      6   5

I want an output containing the rows that satisfy the transitive property on V1 and V2

B:    V1 V2 V3
       1  3  4   

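1 answer:

Answer 0: (score: 1)

The idea is to pick a source row and look for the transitive value reachable from each of its two columns. If both lookups land on the same value, you have a valid combination.

I added extra columns for debugging purposes, but the query can be simplified a bit.

SQL DEMO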

SELECT *
FROM (
        SELECT source.[V1], source.[V2],
               target1.[V1] as t1_v1,
               target1.[V2] as t1_v2,
               target2.[V1] as t2_v1,
               target2.[V2] as t2_v2,
               CASE WHEN source.[V1] = target1.[V1] 
                    THEN target1.[V2]
                    ELSE target1.[V1]
               END as transitive1,
               CASE WHEN source.[V2] = target2.[V2] 
                    THEN target2.[V1]
                    ELSE target2.[V2]
               END as transitive2     
        FROM A as source
        JOIN A as target1
          ON      (source.[V1] = target1.[V1] OR source.[V1] = target1.[V2])
          AND NOT (source.[V1] = target1.[V1] AND source.[V2] = target1.[V2])
        JOIN A as target2    
          ON      (source.[V2] = target2.[V1] OR source.[V2] = target2.[V2])
          AND NOT (source.[V1] = target2.[V1] AND source.[V2] = target2.[V2])
     ) T
WHERE T.transitive1 = T.transitive2
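To see which rows survive the `transitive1 = transitive2` check, here is a minimal sketch that loads the sample table A into an in-memory SQLite database and runs the same join. Using Python's `sqlite3` module is my assumption for the harness (the original demo ran on a different engine), and the names `ROWS_A`, `DEBUG_QUERY` and `run_debug_query` are made up for illustration. The debug columns for the two targets are left out to keep it short.

```python
import sqlite3

# Sample rows of table A from the question.
ROWS_A = [(1, 3), (1, 4), (3, 4), (1, 6), (6, 5)]

# Same joins as the query above, keeping only the columns needed to see the match.
DEBUG_QUERY = """
SELECT T.V1, T.V2, T.transitive1, T.transitive2
FROM (
    SELECT source.V1 AS V1,
           source.V2 AS V2,
           CASE WHEN source.V1 = target1.V1
                THEN target1.V2 ELSE target1.V1 END AS transitive1,
           CASE WHEN source.V2 = target2.V2
                THEN target2.V1 ELSE target2.V2 END AS transitive2
    FROM A AS source
    JOIN A AS target1
      ON      (source.V1 = target1.V1 OR source.V1 = target1.V2)
      AND NOT (source.V1 = target1.V1 AND source.V2 = target1.V2)
    JOIN A AS target2
      ON      (source.V2 = target2.V1 OR source.V2 = target2.V2)
      AND NOT (source.V1 = target2.V1 AND source.V2 = target2.V2)
) T
WHERE T.transitive1 = T.transitive2
ORDER BY T.V1, T.V2
"""


def run_debug_query(rows):
    """Create table A in memory, insert the rows, and return the matches."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE A (V1 INTEGER, V2 INTEGER)")
        conn.executemany("INSERT INTO A (V1, V2) VALUES (?, ?)", rows)
        return conn.execute(DEBUG_QUERY).fetchall()
    finally:
        conn.close()


matches = run_debug_query(ROWS_A)
for row in matches:
    print(row)
# (1, 3, 4, 4)
# (1, 4, 3, 3)
# (3, 4, 1, 1)
```

Each of the three rows that form the triangle 1-3-4 acts as the source once, so the same triple shows up three times with a different third value. That is why an extra filter is needed to keep only one of them.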

Output

[Image: query output]

To get the result you want, select the right columns and add an additional filter

SELECT T.[V1] as [V1], 
       T.[V2] as [V2], 
       T.[transitive1] as [V3]

....

-- keep a single ordering of each triple: V1 < V2 < V3
WHERE T.[V1] < T.[V2]
  AND T.[V2] < T.[transitive1]
  AND T.transitive1 = T.transitive2
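Putting the pieces together, the sketch below is one possible way to write the full final query, again run through Python's `sqlite3` as an assumed harness. The subquery is my reconstruction of the elided part, without the debug columns, and `FINAL_QUERY` and `transitive_triples` are hypothetical names. The ordering filter is written as V1 < V2 < V3, which is the direction that returns the row (1, 3, 4) on the sample data.

```python
import sqlite3

# Sample rows of table A from the question.
ROWS_A = [(1, 3), (1, 4), (3, 4), (1, 6), (6, 5)]

# Assumed reconstruction of the final query: the inner SELECT repeats the
# joins from the debug query, the outer WHERE keeps one ordering per triple.
FINAL_QUERY = """
SELECT T.V1 AS V1, T.V2 AS V2, T.transitive1 AS V3
FROM (
    SELECT source.V1 AS V1,
           source.V2 AS V2,
           CASE WHEN source.V1 = target1.V1
                THEN target1.V2 ELSE target1.V1 END AS transitive1,
           CASE WHEN source.V2 = target2.V2
                THEN target2.V1 ELSE target2.V2 END AS transitive2
    FROM A AS source
    JOIN A AS target1
      ON      (source.V1 = target1.V1 OR source.V1 = target1.V2)
      AND NOT (source.V1 = target1.V1 AND source.V2 = target1.V2)
    JOIN A AS target2
      ON      (source.V2 = target2.V1 OR source.V2 = target2.V2)
      AND NOT (source.V1 = target2.V1 AND source.V2 = target2.V2)
) T
WHERE T.V1 < T.V2
  AND T.V2 < T.transitive1
  AND T.transitive1 = T.transitive2
ORDER BY T.V1, T.V2
"""


def transitive_triples(rows):
    """Return (V1, V2, V3) triples where all three pairs exist in the input."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE A (V1 INTEGER, V2 INTEGER)")
        conn.executemany("INSERT INTO A (V1, V2) VALUES (?, ?)", rows)
        return conn.execute(FINAL_QUERY).fetchall()
    finally:
        conn.close()


triples = transitive_triples(ROWS_A)
print(triples)  # [(1, 3, 4)]
```

The pairs (1, 6) and (6, 5) drop out because there is no third row connecting 1 and 5, so they never produce matching transitive values.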