Question

I'm trying to reformat a MySQL table for use in a network node mapping program. The original format is:

| ID | story | org | scribe |

I want to pull all the org names out into two columns of an output table, like this:

| org1 | org2 | scribe | weight of connection |

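org1 and org2 both come from the same field in the original table, and they are related to each other by sharing one or more scribes. All scribes have unique IDs. And of course, I don't want duplicate entries.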

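What I CAN do so far is pull out all the orgs connected to any one org in the list, by running a '%text%' search on that org and then excluding that org from the output, like this: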

SELECT 'tabitha' as org1,
       org as org2,
       teller as scribe_id,
       count(teller) as weight
FROM `stories`
WHERE teller in (
        SELECT teller
        FROM `stories`
        WHERE org like '%tabitha%'
        group by teller
      )
  and org not like '%tabitha%'
group by teller, org

So I think some trick involving a self-join or a CASE expression might work, but I haven't found anything yet.

Answer 1

I'm not entirely clear on what you're trying to do, but maybe something like this?

select t1.org as org1,
       t2.org as org2,
       t1.teller as scribe_id,   /* qualified: a bare "teller" is ambiguous in a self-join */
       count(t1.teller) as weight
from stories t1 join stories t2 on t1.teller=t2.teller
where t1.org!=t2.org
group by t1.teller, t1.org, t2.org

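This joins t1 and t2 (two aliases of the same table) on teller, and it excludes pairs where both sides have the same org, which also drops every record joined to itself.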

I may be off base, but some version of this join syntax might help.
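To see how the `t1.org != t2.org` condition behaves, here is a minimal sketch on a made-up four-row table. It assumes Python's built-in `sqlite3` as a stand-in for MySQL, and the rows and scribe IDs are invented for illustration. With `!=` every pair shows up in both directions; comparing with `<` instead keeps one row per pair, which may be what the "no duplicate entries" requirement calls for.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the toy rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stories (id INTEGER PRIMARY KEY, story TEXT, org TEXT, teller TEXT)"
)
conn.executemany(
    "INSERT INTO stories (story, org, teller) VALUES (?, ?, ?)",
    [
        ("s1", "alpha", "scribe_a"),
        ("s2", "beta", "scribe_a"),
        ("s3", "beta", "scribe_b"),
        ("s4", "gamma", "scribe_b"),
    ],
)

# Self-join on teller; {op} is swapped between "!=" and "<" below.
PAIRS_SQL = """
SELECT t1.org AS org1, t2.org AS org2, t1.teller AS scribe_id
FROM stories t1
JOIN stories t2 ON t1.teller = t2.teller
WHERE t1.org {op} t2.org
GROUP BY t1.teller, t1.org, t2.org
ORDER BY scribe_id, org1, org2
"""

# "!=" returns (alpha, beta) and (beta, alpha) as separate rows.
both_directions = conn.execute(PAIRS_SQL.format(op="!=")).fetchall()
# "<" keeps only the alphabetically ordered direction of each pair.
one_direction = conn.execute(PAIRS_SQL.format(op="<")).fetchall()

print(len(both_directions), "rows with !=")
print(len(one_direction), "rows with <")
```

Whether one row per pair is wanted depends on the mapping program: if it treats connections as directed, the `!=` form may be the right one.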

Answer 2

This query works. The only adjustment to the solution above is the weight, which was not being calculated correctly.

select t1.org as org1,
       t2.org as org2,
       t1.teller as scribe_id,
       count(distinct t1.story) as weight
       /* need to count the stories instead of the scribes now */
from stories t1 join stories t2 on t1.teller=t2.teller
where t1.org!=t2.org
    and t1.org not in ('none','[swahili]','[]')
    /* this just excludes nonsense categories */
    and t2.org not in ('none','[swahili]','[]')
/* group by both orgs so that each pair gets its own row */
group by t1.teller, t1.org, t2.org
order by weight desc, t1.org;
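The reason for the change is easier to see on a small example. The sketch below is only an illustration: it assumes Python's built-in `sqlite3` in place of MySQL and uses invented rows. One scribe has two stories at one org and three at another, so the self-join produces 2 × 3 = 6 rows for that pair. Counting the scribe column counts those join rows, while `count(distinct t1.story)` counts the stories on the org1 side.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the toy rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stories (id INTEGER PRIMARY KEY, story TEXT, org TEXT, teller TEXT)"
)
conn.executemany(
    "INSERT INTO stories (story, org, teller) VALUES (?, ?, ?)",
    [
        ("s1", "alpha", "scribe_a"),
        ("s2", "alpha", "scribe_a"),
        ("s3", "beta", "scribe_a"),
        ("s4", "beta", "scribe_a"),
        ("s5", "beta", "scribe_a"),
    ],
)

row = conn.execute(
    """
    SELECT t1.org AS org1,
           t2.org AS org2,
           COUNT(t1.teller) AS join_rows,          -- one per joined row pair
           COUNT(DISTINCT t1.story) AS stories     -- stories on the org1 side
    FROM stories t1
    JOIN stories t2 ON t1.teller = t2.teller
    WHERE t1.org < t2.org
    GROUP BY t1.teller, t1.org, t2.org
    """
).fetchone()

print(row)
```

Note that the distinct count is taken from the t1 side only, so with the `!=` form the (alpha, beta) and (beta, alpha) rows can carry different weights.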

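For my next question, and I don't even know whether this is possible: can you ask SQL to do an APPROXIMATE match on the teller or scribe? If these IDs are phone numbers and someone left out one of the digits, I would still like to group them together. I think that is too hard for MySQL, and I would need Python or something.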
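For the Python route, one possible approach is to compare IDs by edit distance and group the ones that differ by at most one character. The sketch below is an assumption about how that could look, not something taken from the thread: the function names `edit_distance` and `group_similar`, the threshold of 1, and the sample phone numbers are all hypothetical.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        prev = curr
    return prev[-1]


def group_similar(ids, max_distance=1):
    """Group IDs that are linked by chains of near matches (union-find)."""
    ids = sorted(set(ids))
    parent = {x: x for x in ids}

    def find(x):
        # Follow parent links to the group's root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Compare every pair; fine for small lists, quadratic for large ones.
    for a, b in combinations(ids, 2):
        if edit_distance(a, b) <= max_distance:
            parent[find(b)] = find(a)

    groups = {}
    for x in ids:
        groups.setdefault(find(x), []).append(x)
    return sorted(groups.values())


# Hypothetical teller IDs: the second is the first with one digit missing.
tellers = ["5551234567", "555123467", "5559876543"]
groups = group_similar(tellers)
print(groups)
```

Two caveats: grouping is transitive, so a chain of near matches can merge IDs that are far apart from each other, and two genuinely different phone numbers can differ by a single digit. Once the groups look right, each ID could be mapped to one canonical value before running the self-join.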

MySQL query to get all combinations of elements in the same table field
