How can I update the original table with probabilities using MySQL?

Time: 2014-09-15 06:56:11

Tags: mysql sql probability

source | target
apple  |   dog
dog    |   cat
door   |   cat
dog    |   apple
cat    |   dog              -----step 1.

Running this SQL:

SELECT GREATEST(source,target),LEAST(source,target),COUNT(*) FROM my_table GROUP BY GREATEST(source,target),LEAST(source,target); 

gives

dog   apple 2
dog   cat 2
door  cat 1                 ------step2.

So I want to compute the probability of each pair and store it in a column named "prob":

source | target | prob
apple  |   dog  | 2/(2+2+1)
dog    |   cat  | 2/(2+2+1)
door   |   cat  | 1/(2+2+1)
dog    |   apple| 2/(2+2+1)  
cat    |   dog  | 2/(2+2+1)    -------step3.

How do I get from step 1 to step 3?

1 answer:

Answer 0: (score: 1)

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table
(source VARCHAR(12) NOT NULL,target VARCHAR(12) NOT NULL
,PRIMARY KEY(source,target)
);
INSERT INTO my_table VALUES
('apple','dog'),
('dog','cat'),
('door','cat'),
('dog','apple'),
('cat','dog');

-- Count each unordered pair once (y), then divide by the total number of rows.
SELECT x.*
     , y.total/(SELECT COUNT(*) FROM my_table) prob 
  FROM my_table x
  JOIN 
     ( SELECT GREATEST(source,target) g
            , LEAST(source,target) l
            , COUNT(*) total
         FROM my_table
        GROUP BY g,l
     ) y
    ON (y.g = x.source AND y.l = x.target) 
    OR (y.g = x.target AND y.l = x.source);

+--------+--------+--------+
| source | target | prob   |
+--------+--------+--------+
| apple  | dog    | 0.4000 |
| dog    | apple  | 0.4000 |
| cat    | dog    | 0.4000 |
| dog    | cat    | 0.4000 |
| door   | cat    | 0.2000 |
+--------+--------+--------+

...or something like that.
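
The query above only selects the probabilities, while the question asks for them to be stored in the table. A minimal sketch of one way that might be done, assuming the sample my_table above and a new nullable column prob of type DECIMAL(5,4) (the column type is my assumption, not something stated in the question):

```sql
-- Assumption: prob does not exist yet; DECIMAL(5,4) is an illustrative type choice.
ALTER TABLE my_table ADD COLUMN prob DECIMAL(5,4) NULL;

-- Multi-table UPDATE: both derived tables use aggregates, so MySQL
-- materializes them before my_table is modified.
UPDATE my_table x
  JOIN
     ( SELECT GREATEST(source,target) g
            , LEAST(source,target) l
            , COUNT(*) total
         FROM my_table
        GROUP BY g,l
     ) y
    ON y.g = GREATEST(x.source,x.target)
   AND y.l = LEAST(x.source,x.target)
 CROSS JOIN
     ( SELECT COUNT(*) n FROM my_table ) t   -- total number of rows
   SET x.prob = y.total / t.n;
```

Afterwards, SELECT * FROM my_table should show the same prob values as the SELECT version. Joining on GREATEST/LEAST of the row's own columns is equivalent to the two OR branches used above; it is just a shorter way to match a row to its unordered pair.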