I have some SQL that looks like this:
    SELECT
        stageName,
        count(*) as `count`
    FROM x2production.contact_stages
    WHERE FROM_UNIXTIME(createDate) between '2016-05-01' AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
      AND (stageName = 'DI-Whatever' OR stageName = 'DI-Quote' OR stageName = 'DI-Meeting')
    GROUP BY stageName
    ORDER BY field(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Whatever')
This produces a table like the following:
+-------------+-------+
| stageName   | count |
+-------------+-------+
| DI-quote    |  1230 |
| DI-Meeting  |   985 |
| DI-Whatever |   325 |
+-------------+-------+
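Question:
I want the percentage from one row to the next. For example, the percentage of DI-Meeting relative to DI-Quote. The math would be 100 * 985 / 1230 = 80.0%.
So in the end the table would look like this: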
+-------------+-------+------+
| stageName   | count | perc |
+-------------+-------+------+
| DI-quote    |  1230 |    0 |
| DI-Meeting  |   985 | 80.0 |
| DI-Whatever |   325 | 32.9 |
+-------------+-------+------+
Is there a way to do this in MySQL?
Here is a SQL Fiddle with some jumbled data: http://sqlfiddle.com/#!9/61398/1
Answer 0 (score: 1)
    select stageName, count, if(rownum=1, 0, round(count/toDivideBy*100, 3)) as percent
    from
    (   select stageName, count,
               greatest(@rn:=@rn+1, 0) as rownum,                      -- row counter, assigned inside greatest
               coalesce(if(@rn=1, count, @prev), null) as toDivideBy,  -- previous row's count
               @prev:=count as dummy2                                  -- remember this count for the next row
        from
        (   SELECT
                stageName,
                count(*) as `count`
            FROM Table1
            WHERE FROM_UNIXTIME(createDate) between '2016-05-01' AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
              AND (stageName = 'DI-Underwriting' OR stageName = 'DI-Quote' OR stageName = 'DI-Meeting')
            GROUP BY stageName
            ORDER BY field(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Underwriting')
        ) xDerived1
        cross join (select @rn:=0, @prev:=-1) as xParams1              -- initialize the variables
    ) xDerived2;
+-----------------+-------+---------+
| stageName       | count | percent |
+-----------------+-------+---------+
| DI-Quote        |    16 |       0 |
| DI-Meeting      |    13 |  81.250 |
| DI-Underwriting |     4 |  30.769 |
+-----------------+-------+---------+
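Note that you wanted the first row's percentage to be 0. That is easy to change to 100.
The cross join brings in the variables for use and initializes them. The greatest and coalesce calls are there for safety when using variables, as spelled out in the article, along with clues from the MySQL manual page on Operator Precedence. The derived table names are just that: every derived table needs a name.
Using variables is not safe unless you adhere to the principles in those references. I am not saying I nailed it, but safety is always my focus.
Variable assignments need to follow a safe form, such as setting the @rn variable inside a function like greatest or least. We know that @rn is always greater than 0, so wrapping the assignment in greatest forces our will on the query without changing the value. The same trick applies to coalesce: the null will never be used, and := has a lower precedence in the column that follows it, that is, the last one, @prev:=, which comes after the coalesce.
That way, a variable is set before the other columns in that select row try to use its value.
So just getting the expected result does not mean that you did it safely or that it will work with your real data.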
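To spell out the row-by-row logic the variables are meant to implement, here is a minimal sketch in Python rather than MySQL. It is not part of the query: the function name add_percent_of_previous and its parameters are made up for illustration, and the counts are the ones from the question's example table. What it shows is the ordering the answer relies on: each row reads the previous count first and only then overwrites it.

```python
def add_percent_of_previous(rows, first_value=0.0, digits=3):
    """Return (name, count, percent) tuples, where percent is the row's
    count as a percentage of the previous row's count.

    The first row has no predecessor, so it gets `first_value`
    (0 here, as in the question; pass 100 if you prefer).
    Counts are assumed to be non-zero, as COUNT(*) per group always is.
    """
    result = []
    prev = None  # plays the role of @prev in the SQL
    for name, count in rows:
        if prev is None:
            percent = first_value
        else:
            # Read the previous count BEFORE it is overwritten below.
            percent = round(100 * count / prev, digits)
        result.append((name, count, percent))
        prev = count  # remember this row's count for the next iteration
    return result


# Counts taken from the example table in the question.
stages = [("DI-Quote", 1230), ("DI-Meeting", 985), ("DI-Whatever", 325)]
for row in add_percent_of_previous(stages, digits=1):
    print(row)
```

Rounded to one decimal place this gives 80.1 and 33.0 for the second and third rows; the 80.0 and 32.9 in the question appear to be truncated rather than rounded.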
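Answer 1 (score: 0)
You need the LAG function; since MySQL does not support it (window functions only arrived in MySQL 8.0), you have to emulate it like this: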
    select stageName,
           cnt,
           IF(valBefore is null, 0, ((100*cnt)/valBefore)) as perc
    from (SELECT tb.stageName,
                 tb.cnt,
                 @ct AS valBefore,           -- previous row's count (NULL on the first row)
                 (@ct := tb.cnt) AS dummy    -- remember this row's count for the next row
          FROM (SELECT stageName,
                       count(*) as cnt
                FROM Table1,
                     (SELECT @ct := NULL) vars   -- initialize the variable
                WHERE FROM_UNIXTIME(createDate) between '2016-05-01'
                                                AND DATE_ADD('2016-08-31', INTERVAL 1 DAY)
                  AND stageName in ('DI-Underwriting', 'DI-Quote', 'DI-Meeting')
                GROUP BY stageName
                ORDER BY field(stageName, 'DI-Quote', 'DI-Meeting', 'DI-Underwriting')
               ) tb
         ) as final;
See it here: http://sqlfiddle.com/#!9/61398/35
Edit: I actually edited it to remove an unnecessary step (a subquery).
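Since this answer was written, MySQL 8.0 has added window functions, including LAG(), so the emulation is only needed on older versions. The sketch below shows the LAG() form of the calculation. To keep it runnable without a MySQL server it uses Python's standard-library sqlite3 module as a stand-in, which is an assumption of this example and needs SQLite 3.25 or newer for window functions. The table stage_counts is hypothetical, with counts hard-coded from the question's example instead of computed from Table1, and a CASE expression replaces MySQL's FIELD(), which SQLite lacks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the already-aggregated counts from the question.
conn.execute("CREATE TABLE stage_counts (stageName TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO stage_counts VALUES (?, ?)",
    [("DI-Meeting", 985), ("DI-Whatever", 325), ("DI-Quote", 1230)],
)

QUERY = """
SELECT stageName,
       cnt,
       -- LAG(cnt) is NULL on the first row, so COALESCE turns that into 0.
       COALESCE(ROUND(100.0 * cnt / LAG(cnt) OVER (ORDER BY pos), 1), 0) AS perc
FROM (
    SELECT stageName,
           cnt,
           -- Stand-in for MySQL's FIELD(stageName, ...) ordering.
           CASE stageName
               WHEN 'DI-Quote'    THEN 1
               WHEN 'DI-Meeting'  THEN 2
               WHEN 'DI-Whatever' THEN 3
           END AS pos
    FROM stage_counts
) AS ordered
ORDER BY pos
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On MySQL 8.0 or later the same idea should carry over by using the original GROUP BY query as the derived table and ordering the window by field(stageName, ...); that variant is not tested here.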