How can I use a JOIN instead of a UNION to count the neighbors of "A OR B"?

Time: 2010-11-28 09:39:27

Tags: sql sql-server-2008

The following query counts the common neighbors of two nodes in a graph:

DECLARE @monthly_connections_test TABLE (
    calling_party VARCHAR(50)
  , called_party VARCHAR(50))

INSERT INTO @monthly_connections_test
          SELECT 'z1', 'z2'
UNION ALL SELECT 'z1', 'z3'
UNION ALL SELECT 'z1', 'z4'
UNION ALL SELECT 'z1', 'z5'
UNION ALL SELECT 'z1', 'z6'
UNION ALL SELECT 'z2', 'z1'
UNION ALL SELECT 'z2', 'z4'
UNION ALL SELECT 'z2', 'z5'
UNION ALL SELECT 'z2', 'z7'
UNION ALL SELECT 'z3', 'z1'
UNION ALL SELECT 'z4', 'z7'
UNION ALL SELECT 'z5', 'z1'
UNION ALL SELECT 'z5', 'z2'
UNION ALL SELECT 'z7', 'z4'
UNION ALL SELECT 'z7', 'z2'

SELECT     monthly_connections_test.calling_party AS user1, monthly_connections_test_1.calling_party AS user2, COUNT(*) AS calling_calling, 0 AS calling_called, 
                      0 AS called_calling, 0 AS called_called, 0 AS both_directions
FROM         @monthly_connections_test AS monthly_connections_test INNER JOIN
                      @monthly_connections_test AS monthly_connections_test_1 ON 
                      monthly_connections_test.called_party = monthly_connections_test_1.called_party AND 
                      monthly_connections_test.calling_party < monthly_connections_test_1.calling_party
GROUP BY monthly_connections_test.calling_party, monthly_connections_test_1.calling_party

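For the following graph [image]

it returns the number of common neighbors called by both user1 and user2. For example, for z1 and z2 it returns 2, because both of them call z4 and z5.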

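The other thing I want to compute is the number of all neighbors of two nodes (users) that are called by user1 or user2. For example, for the pair (z1, z2) the query should return 5: user z1 calls z2, z3, z4, z5, z6 and user z2 calls z1, z4, z5, z7. The connection between z1 and z2 must be excluded because (z1, z2) is the observed pair, and the number of elements in (z3, z4, z5, z6) U (z4, z5, z7) is 5.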

Does anyone know how to modify or create a join query that implements the logic above?

Thanks!

5 answers:

Answer 0: (score: 2)

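@Martin's answer is correct. He's a genius.

Go Martin!

CORRECTION

His answer needs one small modification when it is run against the bidirectional solution I provided. Otherwise the results are incorrect.

So your answer is his and mine combined :)

The complete solution: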

DECLARE @T1 TABLE (calling_party VARCHAR(50), called_party VARCHAR(50))

INSERT  INTO @T1
SELECT  *
FROM    dbo.monthly_connections_test

INSERT  INTO @T1
SELECT  *
FROM    (
        SELECT  called_party AS calling_party, calling_party AS called_party
        FROM    dbo.monthly_connections_test AS T2
        WHERE   T2.called_party < T2.calling_party
        ) T2
WHERE   NOT EXISTS (
        SELECT *
        FROM    monthly_connections_test
        WHERE   calling_party = T2.calling_party and called_party = T2.called_party
)

select u1, u2, count(called_party) called_parties 
from (
select distinct u1, u2, called_party from 
(
        select a1.calling_party u1, a2.calling_party u2 from 
        (select calling_party from @T1 group by calling_party) a1,
        (select calling_party from @T1 group by calling_party) a2
) pairs,
 @T1 AS T
where
(u1 <> u2) and 
((u1 = t.calling_party and u2 <> t.called_party) or
(u2 = t.calling_party and u1 <> t.called_party))
) res
group by u1, u2
order by u1, u2

Answer 1: (score: 1)

I don't have SQL Server here, but this should work:

select u1, u2, count(called_party) called_parties 
from (
select distinct u1, u2, called_party from 
(
    select a1.calling_party u1, a2.calling_party u2 from 
        (select calling_party from @monthly_connections_test group by calling_party) a1,
        (select calling_party from @monthly_connections_test group by calling_party) a2
) pairs,
 @monthly_connections_test t
where 
(u1 = t.calling_party and u2 <> t.called_party) or
(u2 = t.calling_party and u1 <> t.called_party)
) res
group by u1, u2;

The pairs subquery simply creates all possible pairs of users; you probably already have a list of users somewhere else.
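As a rough sketch of what the pair-building step produces on its own, the snippet below rebuilds the question's table variable and lists the pairs. The `<` join condition is my own variation and not part of the query above: it keeps each unordered pair once and never pairs a user with itself.

```sql
-- Sample data copied from the question.
DECLARE @monthly_connections_test TABLE (
    calling_party VARCHAR(50),
    called_party  VARCHAR(50));

INSERT INTO @monthly_connections_test (calling_party, called_party)
VALUES ('z1','z2'), ('z1','z3'), ('z1','z4'), ('z1','z5'), ('z1','z6'),
       ('z2','z1'), ('z2','z4'), ('z2','z5'), ('z2','z7'),
       ('z3','z1'), ('z4','z7'), ('z5','z1'), ('z5','z2'),
       ('z7','z4'), ('z7','z2');

-- One row per unordered pair of distinct callers (u1 < u2).
SELECT a.calling_party AS u1, b.calling_party AS u2
FROM (SELECT DISTINCT calling_party FROM @monthly_connections_test) AS a
INNER JOIN (SELECT DISTINCT calling_party FROM @monthly_connections_test) AS b
    ON a.calling_party < b.calling_party
ORDER BY u1, u2;
```

Only users that appear as calling_party are paired, so z6 (which is only ever called) does not show up. If you have a separate user table, selecting from it in place of the two DISTINCT subqueries would likely be the better source.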

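Answer 2: (score: 0)

Out of interest, doesn't z1 also call z2 and vice versa, so that the desired result is (z2, z3, z4, z5, z6) U (z1, z4, z5, z7), which has 7 elements?

Would the COMPUTE operation give you the count you need?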

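Answer 3: (score: 0)

OK, this is a really tough nut to crack!

The first problem is that the data in the table is bidirectional. The first step toward solving this is to make the data unidirectional.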

DECLARE @T1 TABLE (calling_party VARCHAR(50), called_party VARCHAR(50))
DECLARE @T2 TABLE (calling_party VARCHAR(50), called_party VARCHAR(50))

INSERT  INTO @T1
SELECT  *
FROM    dbo.monthly_connections_test

INSERT  INTO @T1
SELECT  *
FROM    (
        SELECT  called_party AS calling_party, calling_party AS called_party
        FROM    dbo.monthly_connections_test AS T2
        WHERE   T2.called_party < T2.calling_party
        ) T2
WHERE   NOT EXISTS (
        SELECT *
        FROM    monthly_connections_test
        WHERE   calling_party = T2.calling_party and called_party = T2.called_party
)

INSERT  INTO @T2
SELECT  DISTINCT TOP (100) PERCENT calling_party, called_party
FROM    @T1
WHERE   calling_party < called_party
UNION
SELECT  DISTINCT TOP (100) PERCENT called_party AS calling_party, calling_party AS called_party
FROM    @T1
WHERE   calling_party > called_party

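The above resolves the bidirectional problem completely by flattening the data into distinct 1:1 relationships. The result has only 9 records, one for each relationship in the original data.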
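For comparison, here is a more compact sketch of the same flattening idea. It assumes the table variable from the question rather than dbo.monthly_connections_test, and the column names party_a and party_b are made up for illustration. Each row is ordered so the smaller name comes first, and DISTINCT then collapses the two directions of a connection into one row.

```sql
-- Sample data copied from the question.
DECLARE @monthly_connections_test TABLE (
    calling_party VARCHAR(50),
    called_party  VARCHAR(50));

INSERT INTO @monthly_connections_test (calling_party, called_party)
VALUES ('z1','z2'), ('z1','z3'), ('z1','z4'), ('z1','z5'), ('z1','z6'),
       ('z2','z1'), ('z2','z4'), ('z2','z5'), ('z2','z7'),
       ('z3','z1'), ('z4','z7'), ('z5','z1'), ('z5','z2'),
       ('z7','z4'), ('z7','z2');

-- Put the smaller name first, then deduplicate: one row per connection,
-- regardless of who called whom.
SELECT DISTINCT
    CASE WHEN calling_party < called_party
         THEN calling_party ELSE called_party END AS party_a,
    CASE WHEN calling_party < called_party
         THEN called_party ELSE calling_party END AS party_b
FROM @monthly_connections_test
ORDER BY party_a, party_b;
```

On the sample data this should produce the same 9 relationships. Note that it throws away the call direction, so it only fits if "neighbor" is meant in the undirected sense.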

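We (yes, after all this time it is now my problem too) should be able to query the result to get the desired neighbors. That is the next hurdle...

Answer 4: (score: 0)

Niko, I believe a data point is missing from the sample table in the question. I added the following row for testing.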

UNION ALL SELECT 'z1', 'z6'

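I have two simple queries that answer these questions:

"the number of common neighbors called by user1 and user2"

"I want to compute the number of all neighbors of two nodes (users) that are called by user1 or user2"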

declare @Party1 varchar(10)
declare @Party2 varchar(10)
set @Party1 = 'z1'
set @Party2 = 'z2'
select count(distinct called_party) AS 'Total calls 2 neighbors' 
from @monthly_connections_test
WHERE calling_party in (@Party1, @Party2)
AND called_party not in (@Party1 , @Party2)

;With cteAllCalls(x) as
(
Select called_party from @monthly_connections_test 
where called_party != @Party1 and calling_party = @Party2
 )

select Count(X) AS 'Total common calls' from cteAllCalls
inner join @monthly_connections_test on x = called_party
and called_party != @Party2 and calling_party = @Party1
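The first query above handles one pair at a time. A possible way to extend it to every pair with a plain JOIN and no UNION is sketched below. It assumes the question's table variable; the column alias called_by_either is made up for illustration, and the pairs are built only from users that appear as calling_party.

```sql
-- Sample data copied from the question.
DECLARE @monthly_connections_test TABLE (
    calling_party VARCHAR(50),
    called_party  VARCHAR(50));

INSERT INTO @monthly_connections_test (calling_party, called_party)
VALUES ('z1','z2'), ('z1','z3'), ('z1','z4'), ('z1','z5'), ('z1','z6'),
       ('z2','z1'), ('z2','z4'), ('z2','z5'), ('z2','z7'),
       ('z3','z1'), ('z4','z7'), ('z5','z1'), ('z5','z2'),
       ('z7','z4'), ('z7','z2');

SELECT p.user1, p.user2,
       COUNT(DISTINCT t.called_party) AS called_by_either
FROM (
    -- Every unordered pair of distinct callers.
    SELECT a.calling_party AS user1, b.calling_party AS user2
    FROM (SELECT DISTINCT calling_party FROM @monthly_connections_test) AS a
    INNER JOIN (SELECT DISTINCT calling_party FROM @monthly_connections_test) AS b
        ON a.calling_party < b.calling_party
) AS p
-- Keep calls made by either member of the pair,
-- but not calls that target the pair itself.
INNER JOIN @monthly_connections_test AS t
    ON  t.calling_party IN (p.user1, p.user2)
    AND t.called_party NOT IN (p.user1, p.user2)
GROUP BY p.user1, p.user2
ORDER BY p.user1, p.user2;
```

With the sample data, the row for (z1, z2) should show 5, which matches the count the question expects. COUNT(DISTINCT ...) is what removes the overlap (z4 and z5 are called by both users), so no UNION is needed. Because this is an INNER JOIN, a pair whose members call nobody outside the pair would be missing from the output instead of appearing with 0.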