I have a table with trader statistics:
id | number of trades | profit
1 | 10 | 1.05
2 | 20 | 1.06
3 | 1000 | 1.06
4 | 100 | 0.95
5 | 150 | 1.06
6 | 20 | 1.06
7 | 20 | 1.07
...
I want to build a table of 100 intervals by number of trades, where each interval contains the same number of traders:
number of trades | number of traders | average profit
0-156 | 1500 | 1.05
156-1500 | 1500 | 0.95
1500-1610 | 1500 | 1.55
....
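What would the query look like?
Answer 0 (score: 1)
Break the problem into smaller parts. See Polya's How to Solve It.
Group by interval (percentiles, i.e. 100 intervals); just leave the interval number out of the output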
Select
Interval,
Min(NumOfTrades) as minTrades,
Max(NumOfTrades) as maxTrades,
Count(*) as NumOfTraders,
Avg(profit) as AvgProfit
From
... and some magic here, see below ...
Group by Interval
Create the intervals
(Select
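-- Scale the 0-based sequence to 100 equal-sized intervals (0..99), assuming integer division;
-- dividing by 100 alone would instead give buckets of 100 traders each
SeqByTradesAndID * 100 / (Select Count(*) From Traders) as Interval,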
id,
NumOfTrades,
profit
From
... again, some inner workings ...
)
Sequence number by ascending number of trades and ID
(Select
(Select Count(*) From Traders as T2
Where ( T2.NumOfTrades < T1.NumOfTrades )
Or ( T2.NumOfTrades = T1.NumOfTrades
And T2.id < T1.id)
) as SeqByTradesAndID,
T1.id,
T1.NumOfTrades,
T1.profit
From Traders as T1
)
I think this should do it, but I haven't tested it.
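The three pieces above can be assembled into one query. The following is a minimal sketch, not the answerer's tested code: it runs the combined query in an in-memory SQLite database through Python's `sqlite3` module. The sample rows, the `Bucket` and `Seq` aliases, and the choice of 5 intervals instead of 100 are assumptions made to keep the demo small, and the interval is computed by scaling the sequence number by the total row count so that the number of intervals is fixed.

```python
import sqlite3

# Hypothetical sample data: 10 traders as (id, NumOfTrades, profit).
rows = [
    (1, 10, 1.05), (2, 20, 1.06), (3, 1000, 1.06), (4, 100, 0.95),
    (5, 150, 1.06), (6, 20, 1.06), (7, 20, 1.07), (8, 300, 1.00),
    (9, 5, 0.90), (10, 700, 1.10),
]
N_INTERVALS = 5  # the question asks for 100; 5 keeps the demo readable

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Traders (id INTEGER, NumOfTrades INTEGER, profit REAL)")
conn.executemany("INSERT INTO Traders VALUES (?, ?, ?)", rows)

query = """
SELECT Bucket,
       MIN(NumOfTrades) AS minTrades,
       MAX(NumOfTrades) AS maxTrades,
       COUNT(*)         AS NumOfTraders,
       AVG(profit)      AS AvgProfit
FROM (
    -- Scale the 0-based sequence number to N_INTERVALS buckets (integer division).
    SELECT Seq * :n / (SELECT COUNT(*) FROM Traders) AS Bucket,
           NumOfTrades, profit
    FROM (
        -- Sequence number: how many traders sort before this one by (trades, id).
        SELECT (SELECT COUNT(*) FROM Traders AS T2
                WHERE T2.NumOfTrades < T1.NumOfTrades
                   OR (T2.NumOfTrades = T1.NumOfTrades AND T2.id < T1.id)
               ) AS Seq,
               T1.NumOfTrades, T1.profit
        FROM Traders AS T1
    )
)
GROUP BY Bucket
ORDER BY Bucket
"""
result = conn.execute(query, {"n": N_INTERVALS}).fetchall()
for row in result:
    print(row)
```

The correlated subquery compares every trader against every other one, so this form is likely to be slow on large tables; it is mainly useful where window functions are not available.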
Answer 1 (score: 1)
Take a look at the query below. Hopefully it solves the problem.
SELECT MIN(number_of_trades)::text || '-' || MAX(number_of_trades)::text AS "number of trades"
, COUNT(number_of_trades) AS "number of traders"
, AVG(profit) AS AvgProfit -- plain average profit per group, as the question asks
FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY number_of_trades) AS rn
, number_of_trades
, profit
, (COUNT(*) OVER(PARTITION BY NULL) / 100) AS Grp
FROM TradersStat
) tab
GROUP BY (rn - 1) / Grp
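You can see it working here: SQL Fiddle
The solution assumes there are at least 100 records; with fewer, Grp is 0 and the GROUP BY expression divides by zero.
The algorithm is as follows:
[Edit]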
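COUNT(*) OVER(PARTITION BY NULL) computes the total number of traders. Dividing it by 100 gives the number of traders in each group. Normally, PARTITION BY names the columns by which the data set is partitioned, and COUNT(*) counts the records in each partition. PARTITION BY NULL treats all records in the data set as a single partition, so COUNT(*) simply counts every record in the data set. We cannot use COUNT(*) here without the OVER() clause, because it would then be a plain aggregate function, and mixing it with non-aggregated columns without a GROUP BY is invalid.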
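To make the windowed count concrete, here is a small sketch that runs the inner part of the query in SQLite through Python's `sqlite3` module (window functions require SQLite 3.25 or later). The sample rows and the use of 3 groups instead of 100 are assumptions for illustration. An empty `OVER ()` is used here; like `PARTITION BY NULL`, it treats all rows as a single partition.

```python
import sqlite3

# Hypothetical sample data: (number_of_trades, profit) for 6 traders.
rows = [(10, 1.05), (20, 1.06), (1000, 1.06), (100, 0.95), (150, 1.06), (20, 1.07)]
N_GROUPS = 3  # the answer uses 100; 3 keeps the demo small

conn = sqlite3.connect(":memory:")  # window functions need SQLite >= 3.25
conn.execute("CREATE TABLE TradersStat (number_of_trades INTEGER, profit REAL)")
conn.executemany("INSERT INTO TradersStat VALUES (?, ?)", rows)

per_row = conn.execute(
    """
    SELECT number_of_trades,
           COUNT(*) OVER ()                              AS total,  -- same value on every row
           COUNT(*) OVER () / :n                         AS grp,    -- traders per group
           ROW_NUMBER() OVER (ORDER BY number_of_trades) AS rn
    FROM TradersStat
    ORDER BY rn
    """,
    {"n": N_GROUPS},
).fetchall()

# Group index, mirroring the answer's GROUP BY expression (rn - 1) / Grp.
groups = [(rn - 1) // grp for _, _, grp, rn in per_row]
for row, g in zip(per_row, groups):
    print(row, "-> group", g)
```

Every row carries the full row count, which is what lets the outer query divide the row number by the group size without a separate aggregate query.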