How do I count the common neighbors of two users and compute their similarity?

Asked: 2010-11-15 16:10:32

Tags: sql algorithm social-networking similarity

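The monthly_connections table has the columns calling_party, called_party, common_neighbors, and neighborhood_overlap.

Each row therefore records that two users are connected. One measure of user similarity is neighborhood overlap, defined as follows: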

  

neighborhood_overlap = (number of nodes that are neighbors of both calling_party and called_party) / (number of nodes that are neighbors of at least one of calling_party or called_party)
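
As a small worked example of this definition, here is a sketch in Python using plain sets. The five-connection graph and the function names `build_neighbors` and `neighborhood_overlap` are made up for illustration. The sketch reads the definition literally, so the two users themselves count in the denominator because they are neighbors of each other; some formulations of neighborhood overlap exclude them.

```python
from collections import defaultdict


def build_neighbors(connections):
    """Map each user to the set of users it is connected to, in either direction."""
    neighbors = defaultdict(set)
    for calling_party, called_party in connections:
        neighbors[calling_party].add(called_party)
        neighbors[called_party].add(calling_party)
    return neighbors


def neighborhood_overlap(neighbors, a, b):
    """Return (common neighbor count, overlap ratio) for users a and b."""
    na = neighbors.get(a, set())
    nb = neighbors.get(b, set())
    common = na & nb   # neighbors of both users
    either = na | nb   # neighbors of at least one of the users
    if not either:     # neither user has any neighbor
        return 0, 0.0
    return len(common), len(common) / len(either)


# Hypothetical sample data: (calling_party, called_party) pairs.
connections = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("B", "E")]
neighbors = build_neighbors(connections)

# A's neighbors are {B, C, D}; B's neighbors are {A, C, E}.
# Common: {C}. At least one: {A, B, C, D, E}.
print(neighborhood_overlap(neighbors, "A", "B"))  # (1, 0.2)
```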

To count the common neighbors of two users, I wrote the following query:

SELECT COUNT (*)
FROM (SELECT t1.neighborA
      FROM (SELECT called_party AS neighborA FROM monthly_connections
            WHERE calling_party = '9F7334BCF9000CD68D40302DC4801E60C027A7D1'
            UNION
            SELECT calling_party AS neighborA FROM monthly_connections
            WHERE called_party = '9F7334BCF9000CD68D40302DC4801E60C027A7D1') t1
      INNER JOIN (SELECT called_party AS neighborB FROM monthly_connections
                  WHERE calling_party = '10D149A4356E1AA3A8AF604BD992BBA141DB53D2'
                  UNION
                  SELECT calling_party AS neighborB FROM monthly_connections
                  WHERE called_party = '10D149A4356E1AA3A8AF604BD992BBA141DB53D2') t2
         ON t1.neighborA = t2.neighborB) t3

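The query above counts the common neighbors of users 10D149A4356E1AA3A8AF604BD992BBA141DB53D2 and 9F7334BCF9000CD68D40302DC4801E60C027A7D1.

The goal is a query that sets the common_neighbors and neighborhood_overlap columns for every pair of connected users in the table.

Does anyone know how to write a query that updates the common_neighbors and neighborhood_overlap columns?

For common_neighbors, I started writing the following query, but it is not correct: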

UPDATE mc SET
    common_neighbors =
    (SELECT COUNT (*) FROM
        (SELECT t1.neighborA
         FROM (SELECT called_party AS neighborA FROM monthly_connections
               WHERE calling_party = mc.calling_party
               UNION
               SELECT calling_party AS neighborA FROM monthly_connections
               WHERE called_party = mc.calling_party) t1
         INNER JOIN (SELECT called_party AS neighborB FROM monthly_connections
                     WHERE calling_party = mc.called_party
                     UNION
                     SELECT calling_party AS neighborB FROM monthly_connections
                     WHERE called_party = mc.called_party) t2
            ON t1.neighborA = t2.neighborB) t3)
FROM monthly_connections mc
INNER JOIN t3 ON t3.calling_party = mc.calling_party AND t3.called_party = mc.called_party

1 Answer:

Answer 0: (score: 1)

I think this query works (though it may not perform very well).

UPDATE mc 
   SET common_neighbors = (SELECT COUNT (*) FROM
     (
      (SELECT called_party FROM monthly_connections 
        WHERE calling_party = mc.calling_party
         UNION
       SELECT calling_party FROM monthly_connections 
        WHERE called_party = mc.calling_party
      )
        INTERSECT
      (SELECT calling_party FROM monthly_connections 
       WHERE called_party = mc.called_party
         UNION
       SELECT called_party FROM monthly_connections 
        WHERE calling_party = mc.called_party
      )
     ) t1
   ) FROM monthly_connections mc
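
The question also asks for neighborhood_overlap, which the query above does not set. The following is a rough sketch of one way to fill both columns, run against an in-memory SQLite database through Python's standard `sqlite3` module so that it can be tried on a tiny data set. Several things here are assumptions rather than part of the original setup: the sample rows, the helper table `edges` (every connection listed in both directions), the column types, and the use of SQLite syntax, which differs from the `UPDATE ... FROM` form used above. The denominator follows the definition in the question literally, computed as |N(a)| + |N(b)| - |common|, so the two parties themselves are counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly_connections (
    calling_party        TEXT,
    called_party         TEXT,
    common_neighbors     INTEGER,
    neighborhood_overlap REAL
);

-- Hypothetical sample data.
INSERT INTO monthly_connections (calling_party, called_party) VALUES
    ('A', 'B'), ('A', 'C'), ('B', 'C'), ('A', 'D'), ('B', 'E');

-- Hypothetical helper table: each connection in both directions, deduplicated.
CREATE TEMP TABLE edges AS
    SELECT calling_party AS node, called_party AS neighbor FROM monthly_connections
    UNION
    SELECT called_party, calling_party FROM monthly_connections;

-- Common neighbors: nodes adjacent to both parties of the row.
UPDATE monthly_connections SET common_neighbors = (
    SELECT COUNT(*)
    FROM edges a
    JOIN edges b ON a.neighbor = b.neighbor
    WHERE a.node = monthly_connections.calling_party
      AND b.node = monthly_connections.called_party
);

-- Overlap: common / (|N(a)| + |N(b)| - common).
UPDATE monthly_connections SET neighborhood_overlap =
    1.0 * common_neighbors / (
        (SELECT COUNT(*) FROM edges WHERE node = monthly_connections.calling_party)
      + (SELECT COUNT(*) FROM edges WHERE node = monthly_connections.called_party)
      - common_neighbors
    );
""")

rows = conn.execute("""
    SELECT calling_party, called_party, common_neighbors, neighborhood_overlap
    FROM monthly_connections
    ORDER BY calling_party, called_party
""").fetchall()

for row in rows:
    print(row)
```

Materializing `edges` once avoids repeating the UNION for every row, which may help with the performance concern mentioned above; on a large table an index on `edges(node, neighbor)` would probably be worth trying.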