I have an SQL query A (see details below) that returns a table like this:
cluster brand amount
0 bos 600
0 phi 300
0 har 100
1 pro 2500
1 wal 1500
1 ash 1000
2 dil 4200
2 sor 500
2 van 300
...
However, instead of the amount, I want to show the ratio of that amount to the total amount in its cluster, as in the following table:
cluster brand amount
0 bos 0.60
0 phi 0.30
0 har 0.10
1 pro 0.50
1 wal 0.30
1 ash 0.20
2 dil 0.84
2 sor 0.10
2 van 0.06
...
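How should I change my SQL so that I can access the sum of all amounts in a cluster while still keeping multiple rows per cluster?
**Details**
SQL server: MySQL, accessed through the python-MySQL connector interface.
The current SQL query that generates the first table: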
SELECT c.cluster, brand, COUNT(o.id) AS brand_amount
FROM nyon_all.clustering AS c
LEFT JOIN nyon_all.persons AS p ON c.pid = p.id
LEFT JOIN nyon_all.orders AS o ON p.id = o.pid
LEFT JOIN nyon_all.articles AS a ON o.aid = a.id
LEFT JOIN nyon_all.brands AS ab ON a.brand_id = ab.id
WHERE c.cluster_round = 'Org_2014-08-27_10:45:35'
GROUP BY cluster, brand
HAVING brand_amount > 100
ORDER BY c.cluster ASC, brand_amount DESC;
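The table orders (primary key id) links persons (foreign key pid) to articles (foreign key aid). Articles have a specific brand (foreign key brand_id), which maps to a name in the table brands.
The total number of articles per cluster can be retrieved with the following SQL query: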
SELECT c.cluster, COUNT(o.pid) AS amount
FROM nyon_all.clustering AS c
LEFT JOIN nyon_all.persons AS p ON c.pid = p.id
LEFT JOIN nyon_all.orders AS o ON p.id = o.pid
WHERE c.cluster_round = 'Org_2014-08-27_10:45:35'
GROUP BY cluster
ORDER BY c.cluster ASC, amount DESC;
Result:
cluster amount
0 1000
1 5000
2 5000
However, I can't seem to combine the two SQL queries.
Answer 0 (score: 2)
You can join against a subquery that sums the amount per cluster:
select t1.cluster, amount / sumAmount
from Table1 t1
join (select cluster, sum(amount) as sumAmount
from Table1
group by cluster)s
on t1.cluster = s.cluster
See the SqlFiddle.
Edit
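To make the join-against-a-subquery idea concrete, here is a minimal sketch that runs the same query shape on the sample rows from the question. It assumes an in-memory SQLite database standing in for MySQL, a table named Table1 as in the query above, and a hypothetical helper `cluster_shares`; the `1.0 *` factor is only there because SQLite divides integers as integers, whereas MySQL's `/` already returns a decimal.

```python
import sqlite3

# Sample rows from the question's first table: (cluster, brand, amount).
ROWS = [
    (0, "bos", 600), (0, "phi", 300), (0, "har", 100),
    (1, "pro", 2500), (1, "wal", 1500), (1, "ash", 1000),
    (2, "dil", 4200), (2, "sor", 500), (2, "van", 300),
]

# Same shape as the answer's query: join each row to its cluster's total.
QUERY = """
SELECT t1.cluster, t1.brand, 1.0 * t1.amount / s.sumAmount AS share
FROM Table1 AS t1
JOIN (SELECT cluster, SUM(amount) AS sumAmount
      FROM Table1
      GROUP BY cluster) AS s
  ON t1.cluster = s.cluster
ORDER BY t1.cluster ASC, share DESC
"""


def cluster_shares(rows):
    """Load rows into an in-memory table and return (cluster, brand, share)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Table1 (cluster INTEGER, brand TEXT, amount INTEGER)"
        )
        conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for cluster, brand, share in cluster_shares(ROWS):
        print(cluster, brand, round(share, 2))
```

Every row keeps its own cluster and brand, and the subquery contributes exactly one total per cluster, so the join does not multiply rows.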
SELECT
c.cluster,
brand,
COUNT(o.id) / NULLIF(MAX(s.sumBrandAmount), 0) AS brand_amount -- NULLIF guards against dividing by 0; MAX() keeps ONLY_FULL_GROUP_BY satisfied
FROM nyon_all.clustering AS c
LEFT JOIN nyon_all.persons AS p ON c.pid = p.id
LEFT JOIN nyon_all.orders AS o ON p.id = o.pid
LEFT JOIN nyon_all.articles AS a ON o.aid = a.id
LEFT JOIN nyon_all.brands AS ab ON a.brand_id = ab.id
LEFT JOIN (SELECT c1.cluster, COUNT(o1.id) AS sumBrandAmount -- total number of orders per cluster
    FROM nyon_all.clustering AS c1
    LEFT JOIN nyon_all.persons AS p1 ON p1.id = c1.pid
    LEFT JOIN nyon_all.orders AS o1 ON o1.pid = p1.id
    WHERE c1.cluster_round = 'Org_2014-08-27_10:45:35' -- same filter as the main query
    GROUP BY c1.cluster) AS s
ON s.cluster = c.cluster
WHERE c.cluster_round = 'Org_2014-08-27_10:45:35'
GROUP BY c.cluster, brand
HAVING COUNT(o.id) > 100 -- filter on the raw count, since brand_amount is now a ratio
ORDER BY c.cluster ASC, brand_amount DESC;
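As a rough check of the idea behind this edit (count orders per cluster in a subquery, then join that total back onto the per-brand counts by cluster), the sketch below runs a query of that shape against a tiny in-memory SQLite database. Several things here are assumptions and not taken from the original post: SQLite stands in for MySQL, the `nyon_all` schema prefix is dropped, the `brands.brand` column name is guessed from the unqualified `brand` in the question, the rows are invented toy data, inner joins replace the LEFT JOINs for brevity, and `build_db` and `brand_shares` are hypothetical helpers.

```python
import sqlite3

ROUND = "Org_2014-08-27_10:45:35"

# Hypothetical minimal schema; only the columns the query needs.
SCHEMA = """
CREATE TABLE clustering (pid INTEGER, cluster INTEGER, cluster_round TEXT);
CREATE TABLE persons (id INTEGER PRIMARY KEY);
CREATE TABLE brands (id INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE articles (id INTEGER PRIMARY KEY, brand_id INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, pid INTEGER, aid INTEGER);
"""

# Per-brand order counts divided by the per-cluster order total.
QUERY = """
SELECT c.cluster, ab.brand,
       1.0 * COUNT(o.id) / s.cluster_total AS brand_share
FROM clustering AS c
JOIN persons AS p ON c.pid = p.id
JOIN orders AS o ON p.id = o.pid
JOIN articles AS a ON o.aid = a.id
JOIN brands AS ab ON a.brand_id = ab.id
JOIN (SELECT c1.cluster, COUNT(o1.id) AS cluster_total
      FROM clustering AS c1
      JOIN orders AS o1 ON o1.pid = c1.pid
      WHERE c1.cluster_round = :round
      GROUP BY c1.cluster) AS s
  ON s.cluster = c.cluster
WHERE c.cluster_round = :round
GROUP BY c.cluster, ab.brand, s.cluster_total
ORDER BY c.cluster ASC, brand_share DESC
"""


def build_db():
    """Create the toy database: persons 1 and 2 in cluster 0, person 3 in cluster 1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO persons VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany(
        "INSERT INTO clustering VALUES (?, ?, ?)",
        # The last row belongs to another round and must be ignored.
        [(1, 0, ROUND), (2, 0, ROUND), (3, 1, ROUND), (1, 9, "other_round")],
    )
    conn.executemany("INSERT INTO brands VALUES (?, ?)", [(1, "bos"), (2, "phi")])
    conn.executemany("INSERT INTO articles VALUES (?, ?)", [(1, 1), (2, 2)])
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 1, 1), (2, 1, 1), (3, 2, 1), (4, 2, 2), (5, 3, 2), (6, 3, 2)],
    )
    return conn


def brand_shares(conn, cluster_round):
    """Return (cluster, brand, share of the cluster's orders) for one round."""
    return conn.execute(QUERY, {"round": cluster_round}).fetchall()


if __name__ == "__main__":
    for row in brand_shares(build_db(), ROUND):
        print(row)
```

The key design point is that the subquery is grouped and joined on the cluster, not on an individual clustering row, so each brand count is divided by the total for its whole cluster.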