----------------------------------------
ColumnA | ColumnB | ColumnC |
----------------------------------------
Cat | Shirt | Pencil |
Dog | Shirt | Eraser |
Worm | Dress | Pen |
Cow | Shirt | Pen |
Cat | Shirt | Pen |
Cat | Jacket | Pen |
Cow | Shirt | Pen |
Cat | Shirt | Pen |
Cat | Jacket | Pen |
Cow | Shirt | Pen |
Cat | Shirt | Pen |
Cat | Jacket | Pen |
Based on the sample data above, I am trying to find the most frequently recurring combinations of two or more values.
For example:
Shirt,Pen 6
Cat,Pen 6
Cat,Shirt 4
Jacket, Pen 3
Pen,Cow 3
Cat,Shirt,Pen 3
Cat,Jacket,Pen 3
Cow,Shirt,Pen 3
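I need this to work for up to 10 columns of data.
Order does not matter: Cat,Shirt is the same as Shirt,Cat.
What is the best algorithm to use? Preferably in SQL, but I could also try PHP.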
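Answer 0: (score: 3)
You can do this in SQL by identifying each row and adding an "empty" element. Note: this assumes that the values within a row are distinct across columns, or at least interchangeable (that is, a value is not tied to the column it appears in).
I also assume that each row has a unique ID: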
-- Unpivot each row into one (id, col) pair per column, plus a NULL "empty" slot.
with t as (
      select d.id, v.col
      from data d outer apply
           (values (d.col1), (d.col2), (d.col3), (NULL)) v(col)
     )
-- Build sorted combinations t1 < t2 < t3; a NULL t3 stands for a pair.
select t1.col as item1, t2.col as item2, t3.col as item3, count(*) as cnt
from t t1 join
     t t2
     on t1.id = t2.id and t2.col > t1.col join
     t t3
     on t1.id = t3.id and (t3.col > t2.col or t3.col is null)
group by t1.col, t2.col, t3.col
order by count(*) desc;
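To cross-check the query's output outside the database, the same enumeration (every sorted subset of two or more values per row) can be sketched in Python. This is only a minimal illustration under the same assumption that values are interchangeable across columns; the function name `count_combinations` is made up for this example, and the rows are copied from the question's table. It works for any number of columns, though a 10-column row yields 1013 subsets of size two or more, so the counter can grow quickly.

```python
from collections import Counter
from itertools import combinations

# Sample rows from the question: three distinct rows, then a block of
# three rows that repeats three times.
rows = [
    ("Cat", "Shirt", "Pencil"),
    ("Dog", "Shirt", "Eraser"),
    ("Worm", "Dress", "Pen"),
] + [
    ("Cow", "Shirt", "Pen"),
    ("Cat", "Shirt", "Pen"),
    ("Cat", "Jacket", "Pen"),
] * 3


def count_combinations(rows, min_size=2):
    """Count every unordered combination of at least min_size values per row.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for row in rows:
        # Sorting makes ("Shirt", "Cat") and ("Cat", "Shirt") the same key;
        # the set drops NULLs and values repeated within one row.
        values = sorted({v for v in row if v is not None})
        for size in range(min_size, len(values) + 1):
            counts.update(combinations(values, size))
    return counts


counts = count_combinations(rows)
for combo, n in counts.most_common(10):
    print(",".join(combo), n)
```

Because each key is a sorted tuple, the pair the question writes as Shirt,Pen is stored as `("Pen", "Shirt")`.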
Answer 1: (score: 3)
This could be one way:
SELECT c1, c2, c3, COUNT(*) AS cnt
FROM (
    SELECT ColumnA AS c1, ColumnB AS c2, NULL AS c3 FROM your_table
    UNION ALL
    SELECT ColumnA AS c1, ColumnC AS c2, NULL AS c3 FROM your_table
    UNION ALL
    SELECT ColumnB AS c1, ColumnC AS c2, NULL AS c3 FROM your_table
    UNION ALL
    SELECT ColumnA AS c1, ColumnB AS c2, ColumnC AS c3 FROM your_table
) tt
GROUP BY c1, c2, c3
ORDER BY COUNT(*) DESC;
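Writing one UNION ALL branch per column combination by hand gets tedious beyond three columns, so the statement could be generated instead. The Python sketch below is one possible way to do that, not a tested production tool: `build_query` is a hypothetical helper, it interpolates identifiers directly into the SQL (so it should only be given trusted table and column names), and it is exercised here against an in-memory SQLite database loaded with the question's sample rows. For 10 columns it produces 1013 branches, which may be more than some engines accept in a single compound statement.

```python
import sqlite3
from itertools import combinations


def build_query(table, columns, min_size=2):
    """Generate the UNION ALL query for every column combination.

    Hypothetical helper: identifiers are interpolated as-is, so pass
    trusted names only.
    """
    width = len(columns)
    out_cols = ", ".join(f"c{i + 1}" for i in range(width))
    branches = []
    for size in range(min_size, width + 1):
        for combo in combinations(columns, size):
            # Pad shorter combinations with NULL so every branch has
            # the same number of output columns.
            padded = list(combo) + ["NULL"] * (width - size)
            select_list = ", ".join(
                f"{expr} AS c{i + 1}" for i, expr in enumerate(padded)
            )
            branches.append(f"SELECT {select_list} FROM {table}")
    union = "\nUNION ALL\n".join(branches)
    return (
        f"SELECT {out_cols}, COUNT(*) AS cnt FROM (\n{union}\n) tt\n"
        f"GROUP BY {out_cols}\nORDER BY cnt DESC"
    )


# Sample rows from the question's table.
rows = [
    ("Cat", "Shirt", "Pencil"),
    ("Dog", "Shirt", "Eraser"),
    ("Worm", "Dress", "Pen"),
] + [
    ("Cow", "Shirt", "Pen"),
    ("Cat", "Shirt", "Pen"),
    ("Cat", "Jacket", "Pen"),
] * 3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (ColumnA TEXT, ColumnB TEXT, ColumnC TEXT)"
)
conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)

query = build_query("your_table", ["ColumnA", "ColumnB", "ColumnC"])
# Map each (c1, c2, c3) combination to its count.
result = {tuple(r[:-1]): r[-1] for r in conn.execute(query)}
for combo, cnt in result.items():
    print(combo, cnt)
```

Note that this approach keys each combination by column position, so it relies on a given value always appearing in the same column, as it does in the sample data.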