The table 'mydata' below has four columns: payId, custId, rScore, bScore.
payId custId rScore bScore
A 1 0.2 0
A 2 0.3 1
A 4 0.65 1
A 1 0.35 0
B 3 0.5 1
B 5 0.3 1
B 5 0.85 0
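Bucket rScore into a range: a) if rScore < 0.5 then '<0.5', b) if rScore >= 0.5 then '>=0.5'.
For each payId and range, count the distinct custId values.
For each payId and range, compute the ratio (number of 1s) / (number of 1s and 0s) over bScore.
Expected output: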
payId range custId ratio
A <0.5 2 0.33
A >=0.5 1 1
B <0.5 1 1
B >=0.5 2 0.5
This is what I tried, but it does not give the desired output:
SELECT payId, IF(rscore< 0.5, '<0.5','>=0.5') As range,
Case
When rscore< 0.5 then Count(Distict(custId))
When rscore >=0.5 then Count(Distict(custId)))
End AS custID
From mydata
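Answer 0 (score: 0)
You need to group the data by payId and the computed range, since you don't want every row in the output. If you are using MSSQL, this should work.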
; WITH cte_range AS (
SELECT
*
,CASE WHEN rScore < 0.5 THEN '<0.5' ELSE '>=0.5' END AS range
FROM
mydata
)
SELECT
payId
,range
,COUNT(DISTINCT custId) as distinctCustomers
,AVG(1.0 * bScore) AS ratio -- 1.0 * guards against integer averaging if bScore is an INT column
FROM
cte_range
GROUP BY
payId
,range
ORDER BY
payId
,range;
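To check the grouped query against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. It is not MSSQL, so treat it only as a sanity check of the logic. The alias score_range and the function name run_grouped_query are placeholders introduced here (the alias avoids relying on "range", which is a keyword in some dialects), and the column types in CREATE TABLE are assumptions. The average of a 0/1 column equals (count of 1s) / (count of all rows), which is why AVG gives the requested ratio.

```python
import sqlite3

# Sample rows copied from the question's table: (payId, custId, rScore, bScore).
ROWS = [
    ("A", 1, 0.2, 0),
    ("A", 2, 0.3, 1),
    ("A", 4, 0.65, 1),
    ("A", 1, 0.35, 0),
    ("B", 3, 0.5, 1),
    ("B", 5, 0.3, 1),
    ("B", 5, 0.85, 0),
]

# score_range is a hypothetical alias standing in for "range".
QUERY = """
WITH cte_range AS (
    SELECT payId, custId, bScore,
           CASE WHEN rScore < 0.5 THEN '<0.5' ELSE '>=0.5' END AS score_range
    FROM mydata
)
SELECT payId,
       score_range,
       COUNT(DISTINCT custId) AS distinctCustomers,
       ROUND(AVG(1.0 * bScore), 2) AS ratio
FROM cte_range
GROUP BY payId, score_range
ORDER BY payId, score_range
"""


def run_grouped_query():
    """Load the sample rows into an in-memory table and return the grouped result."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumed; the question does not state them.
    conn.execute(
        "CREATE TABLE mydata (payId TEXT, custId INTEGER, rScore REAL, bScore INTEGER)"
    )
    conn.executemany("INSERT INTO mydata VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in run_grouped_query():
        print(row)
```

Each returned tuple is (payId, score_range, distinctCustomers, ratio), one per payId and range, which can be compared row by row with the expected output in the question.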
Answer 1 (score: 0)
You also need to GROUP BY the computed range. One way is to repeat the CASE expression in the GROUP BY:
SELECT
payId,
(CASE WHEN rscore < 0.5 THEN '<0.5' WHEN rscore >= 0.5 THEN '>=0.5' END) AS range,
COUNT(DISTINCT custId) AS TotalCustId,
AVG(1.0 * bScore) AS AverageRatio -- mydata has no ratio column; the ratio is the mean of the 0/1 bScore values
FROM mydata
GROUP BY
payId,
(CASE WHEN rscore < 0.5 THEN '<0.5' WHEN rscore >= 0.5 THEN '>=0.5' END)
ORDER BY 1, 2;
Or use a subquery and group on its result:
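SELECT payId, range,
COUNT(DISTINCT custId) AS TotalCustId,
AVG(1.0 * bScore) AS AverageRatio -- mean of the 0/1 bScore values is the ratio
FROM (
SELECT payId, custId, bScore, -- mydata has no ratio column, so pass bScore through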
(CASE
WHEN rscore < 0.5 THEN '<0.5'
WHEN rscore >= 0.5 THEN '>=0.5'
END) AS range
FROM mydata
) AS Q
GROUP BY payId, range
ORDER BY payId, range;
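If you would rather spell out the ratio exactly as the question defines it, (number of 1s) / (number of 1s and 0s), a conditional SUM divided by a COUNT does the same job as AVG. The following is a rough sketch of that variant, again run in SQLite through Python's sqlite3 only so it can be executed standalone. The names score_range, ones, ones_and_zeros and explicit_ratio are made up for this illustration, and the column types are assumptions.

```python
import sqlite3

# Sample rows copied from the question's table: (payId, custId, rScore, bScore).
ROWS = [
    ("A", 1, 0.2, 0),
    ("A", 2, 0.3, 1),
    ("A", 4, 0.65, 1),
    ("A", 1, 0.35, 0),
    ("B", 3, 0.5, 1),
    ("B", 5, 0.3, 1),
    ("B", 5, 0.85, 0),
]

# The CASE expression is repeated in GROUP BY, as in the first query of this answer.
QUERY = """
SELECT payId,
       CASE WHEN rScore < 0.5 THEN '<0.5' ELSE '>=0.5' END AS score_range,
       COUNT(DISTINCT custId) AS TotalCustId,
       SUM(CASE WHEN bScore = 1 THEN 1 ELSE 0 END) AS ones,
       COUNT(bScore) AS ones_and_zeros,
       1.0 * SUM(CASE WHEN bScore = 1 THEN 1 ELSE 0 END) / COUNT(bScore) AS ratio
FROM mydata
GROUP BY payId,
         CASE WHEN rScore < 0.5 THEN '<0.5' ELSE '>=0.5' END
ORDER BY 1, 2
"""


def explicit_ratio():
    """Return (payId, score_range, TotalCustId, ones, ones_and_zeros, ratio) rows."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumed; the question does not state them.
    conn.execute(
        "CREATE TABLE mydata (payId TEXT, custId INTEGER, rScore REAL, bScore INTEGER)"
    )
    conn.executemany("INSERT INTO mydata VALUES (?, ?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in explicit_ratio():
        print(row)
```

Multiplying by 1.0 before dividing keeps the division from being done on integers. Exposing the numerator and denominator as their own columns also makes it easy to see where each ratio comes from.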
Answer 2 (score: 0)
MSSQL solution:
with cte as (
select *,range=iif(rscore< 0.5, '<0.5','>=0.5') from mydata
)
select payid,range,cnt_custid=count(distinct (custid)),
ratio=avg(1.0 * bScore) from cte -- 1.0 * guards against integer averaging if bScore is an INT column
group by payid,range
ORDER BY payId, range;