I am trying to find the most common pairs and triplets (3/3) of numbers. My table looks like this:
+----+------+------+------+------+------+------+------+------+------+------+
| id | nr1 | nr2 | nr3 | nr4 | nr5 | nr6 | nr7 | nr8 | nr9 | nr10 |
+----+------+------+------+------+------+------+------+------+------+------+
| 1 | 1 | 39 | 19 | 23 | 28 | 80 | 3 | 42 | 60 | 32 |
+----+------+------+------+------+------+------+------+------+------+------+
| 2 | 43 | 18 | 3 | 24 | 29 | 33 | 15 | 1 | 61 | 80 |
+----+------+------+------+------+------+------+------+------+------+------+
| 3 | 11 | 25 | 33 | 2 | 30 | 3 | 1 | 44 | 62 | 78 |
+----+------+------+------+------+------+------+------+------+------+------+
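I would like to know which pairs or triplets of numbers occur most often across all my rows.
Example:
1,3 (3 times)
1,80 (2 times)
3,80 (2 times)
1,3,80 (2 times)
I could try generating the combinations in order, such as 1,2,3, and then looking each one up in the database, but the script I came up with is still terrible and needs hours to check 10000 rows.
Any ideas are welcome. Thank you very much.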
Answer 0 (score: 1)
You need to unpivot your table, but MySQL has no UNPIVOT function, so you can emulate it with UNION ALL:
SQL Fiddle Demo
-- Build a long (id, n_value) table: one row per number per original row.
CREATE TABLE unpivot
SELECT *
FROM (
    SELECT id, nr1  AS n_value FROM tuple UNION ALL
    SELECT id, nr2  AS n_value FROM tuple UNION ALL
    SELECT id, nr3  AS n_value FROM tuple UNION ALL
    SELECT id, nr4  AS n_value FROM tuple UNION ALL
    SELECT id, nr5  AS n_value FROM tuple UNION ALL
    SELECT id, nr6  AS n_value FROM tuple UNION ALL
    SELECT id, nr7  AS n_value FROM tuple UNION ALL
    SELECT id, nr8  AS n_value FROM tuple UNION ALL
    SELECT id, nr9  AS n_value FROM tuple UNION ALL
    SELECT id, nr10 AS n_value FROM tuple
) AS T;
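To see what the unpivoted table contains, here is a minimal sketch that rebuilds it for the three sample rows from the question. It is an illustration under assumptions, not the setup used in the answer: it runs on an in-memory SQLite database through Python's `sqlite3` module instead of MySQL, uses SQLite's `CREATE TABLE ... AS SELECT` form, and the helper name `build_unpivot` is made up for this example.

```python
import sqlite3

# The three sample rows from the question: (id, nr1 .. nr10).
ROWS = [
    (1, 1, 39, 19, 23, 28, 80, 3, 42, 60, 32),
    (2, 43, 18, 3, 24, 29, 33, 15, 1, 61, 80),
    (3, 11, 25, 33, 2, 30, 3, 1, 44, 62, 78),
]


def build_unpivot(conn):
    """Create `tuple` with the sample rows, then unpivot it with UNION ALL."""
    cols = ", ".join(f"nr{i} INTEGER" for i in range(1, 11))
    conn.execute(f'CREATE TABLE "tuple" (id INTEGER PRIMARY KEY, {cols})')
    conn.executemany(
        'INSERT INTO "tuple" VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)', ROWS
    )
    # One SELECT per nrN column, glued together with UNION ALL.
    selects = " UNION ALL ".join(
        f'SELECT id, nr{i} AS n_value FROM "tuple"' for i in range(1, 11)
    )
    # SQLite spelling of MySQL's CREATE TABLE ... SELECT (an assumption of this sketch).
    conn.execute(f"CREATE TABLE unpivot AS {selects}")


conn = sqlite3.connect(":memory:")
build_unpivot(conn)
row_count = conn.execute("SELECT COUNT(*) FROM unpivot").fetchone()[0]
print(row_count)  # 3 rows x 10 columns = 30
```

Each original row contributes ten `(id, n_value)` rows, which is what lets the later self-joins compare numbers that belong to the same `id`.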
Now count the pairs by joining the table with itself.
SELECT n1, n2, COUNT(*) AS total
FROM
(
    SELECT up1.n_value AS n1, up2.n_value AS n2
    FROM unpivot up1
    JOIN unpivot up2
      ON up1.`id` = up2.`id`
     -- "<" keeps each pair once per row, in ascending order
     AND up1.n_value < up2.n_value
) T
GROUP BY n1, n2
ORDER BY total DESC
LIMIT 3;
For triplets, you join the table three times.
SELECT n1, n2, n3, COUNT(*) AS total
FROM
(
    SELECT up1.n_value AS n1, up2.n_value AS n2, up3.n_value AS n3
    FROM unpivot up1
    JOIN unpivot up2
      ON up1.`id` = up2.`id`
     AND up1.n_value < up2.n_value
    JOIN unpivot up3
      ON up2.`id` = up3.`id`
     -- n1 < n2 < n3 keeps each triplet once per row
     AND up2.n_value < up3.n_value
) T
GROUP BY n1, n2, n3
ORDER BY total DESC
LIMIT 3;
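As a sanity check on what the pair and triplet queries count, the same tallies can be computed in plain Python for the three sample rows. This is a different technique from the SQL self-join (it uses `itertools.combinations` and `collections.Counter`), shown only as a cross-check. It assumes no number repeats within a row, and the function name `count_combinations` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# The three sample rows from the question, keyed by id.
ROWS = {
    1: [1, 39, 19, 23, 28, 80, 3, 42, 60, 32],
    2: [43, 18, 3, 24, 29, 33, 15, 1, 61, 80],
    3: [11, 25, 33, 2, 30, 3, 1, 44, 62, 78],
}


def count_combinations(rows, size):
    """Count in how many rows each ascending combination of `size` numbers occurs."""
    counts = Counter()
    for numbers in rows.values():
        # Sorting mirrors the "<" join conditions: every combination is
        # produced once per row, smallest number first.
        counts.update(combinations(sorted(set(numbers)), size))
    return counts


pairs = count_combinations(ROWS, 2)
triplets = count_combinations(ROWS, 3)
print(pairs[(1, 3)], pairs[(1, 80)], pairs[(3, 80)], triplets[(1, 3, 80)])
# prints: 3 2 2 2
```

These match the counts given as the example in the question. A row of 10 numbers yields 45 pairs and 120 triplets, which gives a feel for how many intermediate rows the self-joins produce per original row.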
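Update:
I tested this on PostgreSQL.
I created 50k rows with random values from 1 to 90.
After creating an index, the query takes only 2 seconds to complete.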
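The update does not say which index was created. One plausible choice, and purely an assumption here, is a composite index on `(id, n_value)`, since both columns appear in every join condition. The sketch below applies it to the sample rows in an in-memory SQLite database (not PostgreSQL or MySQL); the index name `idx_unpivot_id_value` is made up, and an extra tie-break was added to `ORDER BY` so the result order is deterministic.

```python
import sqlite3

# The three sample rows from the question: (id, nr1 .. nr10).
ROWS = [
    (1, 1, 39, 19, 23, 28, 80, 3, 42, 60, 32),
    (2, 43, 18, 3, 24, 29, 33, 15, 1, 61, 80),
    (3, 11, 25, 33, 2, 30, 3, 1, 44, 62, 78),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unpivot (id INTEGER, n_value INTEGER)")
conn.executemany(
    "INSERT INTO unpivot VALUES (?, ?)",
    [(row_id, value) for row_id, *values in ROWS for value in values],
)

# Assumed index: covers both the equality on id and the "<" on n_value.
conn.execute("CREATE INDEX idx_unpivot_id_value ON unpivot (id, n_value)")

TRIPLETS_SQL = """
SELECT up1.n_value AS n1, up2.n_value AS n2, up3.n_value AS n3,
       COUNT(*) AS total
FROM unpivot up1
JOIN unpivot up2 ON up1.id = up2.id AND up1.n_value < up2.n_value
JOIN unpivot up3 ON up2.id = up3.id AND up2.n_value < up3.n_value
GROUP BY n1, n2, n3
ORDER BY total DESC, n1, n2, n3  -- tie-break added for a stable order
LIMIT 3
"""
top_triplets = conn.execute(TRIPLETS_SQL).fetchall()
index_names = [row[1] for row in conn.execute("PRAGMA index_list('unpivot')")]
print(top_triplets)
print(index_names)
```

On three rows the index makes no measurable difference; whether it reproduces the 2-second figure on 50k rows depends on the database engine and hardware, so it is worth checking the query plan on the real data.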