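PostgreSQL: percentage comparison of different rows in a separate column

Time: 2014-08-08 00:07:55

Tags: postgresql

I have a table in PostgreSQL (actually a VIEW generated from a bunch of JOINs) that ends up looking like this: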

 test_type |  brand  | model  | band | firmware_version | avg_throughput
-----------+---------+--------+------+------------------+----------------
 1client   | Linksys | N600   | 5ghz | 1.5              |          66.94
 1client   | Linksys | N600   | 5ghz | 2.0              |          94.98
 1client   | Linksys | N600   | 5ghz | 2.11             |         132.40
 1client   | Linksys | EA6500 | 5ghz | 1.5              |         216.46
 1client   | Linksys | EA6500 | 5ghz | 2.0              |         176.79
 1client   | Linksys | EA6500 | 5ghz | 2.11             |         191.44

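What I want to accomplish is to create another column that compares the throughput of the different firmware versions of each model and shows the percentage difference.

More specifically, the query would take the throughput of the lowest firmware version and keep it as the baseline against which the throughput of every other firmware version is compared.

So, if we take the Linksys N600, the lowest firmware version is 1.5 with a throughput of 66.94. We would keep that as the baseline, compare the other throughputs against that number, and show the percentage difference.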
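As a quick check of the arithmetic described above, here is a minimal Python sketch. The helper name `percent_change` is made up for illustration, and the inputs are the rounded N600 throughputs shown in the table, so the results need not match a comparison column computed from different or unrounded data.

```python
def percent_change(baseline, value):
    """Percentage difference of value relative to baseline, rounded to 2 places."""
    return round((value / baseline - 1) * 100, 2)

# Rounded N600 throughputs from the sample table; firmware 1.5 is the baseline.
baseline = 66.94
n600 = {"1.5": 66.94, "2.0": 94.98, "2.11": 132.40}

for firmware, throughput in n600.items():
    print(firmware, percent_change(baseline, throughput))
```

The baseline row always comes out as 0.0, since a value divided by itself is exactly 1.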

The final table would look like this:

 test_type |  brand  | model  | band | firmware_version | avg_throughput | comparison
-----------+---------+--------+------+------------------+----------------+------------
 1client   | Linksys | N600   | 5ghz | 1.5              |          66.94 | 0% (or empty)
 1client   | Linksys | N600   | 5ghz | 2.0              |          94.98 | +41.61%
 1client   | Linksys | N600   | 5ghz | 2.11             |         132.40 | +97.78%
 1client   | Linksys | EA6500 | 5ghz | 0.5              |         216.46 | 0% (or empty)
 1client   | Linksys | EA6500 | 5ghz | 1.2              |         176.79 | -18.32%
 1client   | Linksys | EA6500 | 5ghz | 2.5              |         191.44 | -11.55%

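Any ideas on how to do this?

I like to keep logic separated, and right now I would rather not do this calculation in my application code. I would prefer to have the database do it and then just display the results, but if doing it another way makes more sense, I am open to suggestions.

3 answers:

Answer 0 (score: 2)

Use a subquery over the view that, via a window function, returns the baseline throughput from the earliest firmware version of each model, then join your view to it: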

select
  v.test_type, v.brand, v.model, v.band, v.firmware_version, v.avg_throughput,
  -- percentage gain relative to the baseline (lowest firmware version)
  (100 * v.avg_throughput / b.avg_throughput)::decimal(8,2) - 100 as percent_gain
from myview v
join (select test_type, brand, model, band, avg_throughput,
             rank() over (partition by test_type, brand, model, band
                          order by firmware_version) as rnk
      from myview) b
  on v.test_type = b.test_type
 and v.brand = b.brand
 and v.model = b.model
 and v.band = b.band
 and b.rnk = 1;  -- keep only the baseline row of each group

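See the SQLFiddle, which uses your sample data and produces the expected output.

You could use a correlated subquery instead of a join, but performance would be poor because such a subquery has to be executed once for every row. With a join like this, the query that finds the baseline is executed only once.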
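The "compute the baseline once, then join" idea can be sketched outside the database as well. This is only an illustration in Python, not how PostgreSQL executes the query: the `rows` list is the question's sample data, and the sketch assumes the rows of each model are already listed in ascending firmware order.

```python
# (model, firmware_version, avg_throughput) taken from the question's sample data.
# Assumption: rows of each model appear in ascending firmware order.
rows = [
    ("N600", "1.5", 66.94),
    ("N600", "2.0", 94.98),
    ("N600", "2.11", 132.40),
    ("EA6500", "1.5", 216.46),
    ("EA6500", "2.0", 176.79),
    ("EA6500", "2.11", 191.44),
]

# Step 1 (the derived table): a single pass records the first throughput per model.
baseline = {}
for model, _firmware, throughput in rows:
    baseline.setdefault(model, throughput)

# Step 2 (the join): every row looks up its model's baseline instead of
# rescanning the data, which is what a correlated subquery would do per row.
result = [
    (model, firmware, round((throughput / baseline[model] - 1) * 100, 2))
    for model, firmware, throughput in rows
]

for line in result:
    print(line)
```

The baseline lookup is built once for all rows, which mirrors why the joined subquery is cheaper than re-running a subquery for each row.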

Answer 1 (score: 2)

This is easy to solve with window functions:

select test_type, 
       brand, 
       model, 
       band, 
       firmware_version, 
       avg_throughput,
       ((avg_throughput / first_value(avg_throughput) over (partition by brand, model order by firmware_version)) - 1) * 100 as diff_to_first_version
from temp_table
order by model desc, firmware_version;

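You can also use lag() alongside first_value() to add the difference to the previous version, not just to the first one. Note that lag() returns NULL for the first row of each partition, so diff_to_prev_version is empty there: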
select test_type, 
       brand, 
       model, 
       band, 
       firmware_version, 
       avg_throughput,
       ((avg_throughput / first_value(avg_throughput) over (partition by brand, model order by firmware_version)) - 1) * 100 as diff_to_first_version,
       ((avg_throughput / lag(avg_throughput) over (partition by brand, model order by firmware_version)) - 1) * 100 as diff_to_prev_version
from temp_table
order by model desc, firmware_version;
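To try the window-function query without a PostgreSQL server, the following is a minimal sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (window functions are not available before that), it recreates only the columns needed from the sample data, and the named window `w` and the rounding are additions made for readability, not part of the answer's query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp_table ("
    "brand TEXT, model TEXT, firmware_version TEXT, avg_throughput REAL)"
)
conn.executemany(
    "INSERT INTO temp_table VALUES (?, ?, ?, ?)",
    [
        ("Linksys", "N600", "1.5", 66.94),
        ("Linksys", "N600", "2.0", 94.98),
        ("Linksys", "N600", "2.11", 132.40),
        ("Linksys", "EA6500", "1.5", 216.46),
        ("Linksys", "EA6500", "2.0", 176.79),
        ("Linksys", "EA6500", "2.11", 191.44),
    ],
)

# Same idea as the query above: first_value() gives the baseline of the
# partition, lag() gives the previous row (NULL for the first row).
query = """
SELECT model,
       firmware_version,
       round((avg_throughput / (first_value(avg_throughput) OVER w) - 1) * 100, 2)
           AS diff_to_first_version,
       round((avg_throughput / (lag(avg_throughput) OVER w) - 1) * 100, 2)
           AS diff_to_prev_version
FROM temp_table
WINDOW w AS (PARTITION BY brand, model ORDER BY firmware_version)
ORDER BY model DESC, firmware_version
"""
result_rows = conn.execute(query).fetchall()
for row in result_rows:
    print(row)
```

The first row of each model shows 0.0 for diff_to_first_version and None for diff_to_prev_version, because there is no earlier row for lag() to return.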

SQLFiddle example: http://sqlfiddle.com/#!15/9746f/1

This is going to be faster than a solution that uses a self-join on the table.
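One caveat about ordering by firmware_version: the thread does not state the column's type, but if it is stored as text, the sort is a string comparison, and that can pick the wrong "lowest" version once versions such as 10.0 or 2.2 appear. The Python sketch below shows the difference; `version_key` is a hypothetical helper and assumes every version consists only of dot-separated integers.

```python
def version_key(version):
    """Turn '2.11' into (2, 11) so versions compare numerically, part by part."""
    return tuple(int(part) for part in version.split("."))

versions = ["2.11", "10.0", "2.2", "1.5"]

print(sorted(versions))                   # ['1.5', '10.0', '2.11', '2.2']
print(sorted(versions, key=version_key))  # ['1.5', '2.2', '2.11', '10.0']
```

In PostgreSQL, a comparable effect can probably be had by ordering on `string_to_array(firmware_version, '.')::int[]`, again assuming purely numeric parts. With the sample data in the question (1.5, 2.0, 2.11) both orderings agree.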

Answer 2 (score: 0)

After some help here, I came up with this query, which does what I need.

SELECT
   v.test_type, v.brand, v.model, v.band, v.firmware_version, v.avg_throughput,
   -- NULLIF avoids a division by zero when the baseline throughput is 0
   ROUND((100 * v.avg_throughput / NULLIF(b.min_avg, 0)) - 100::numeric, 2) AS percentage
FROM temp_table v
JOIN (
    -- DISTINCT ON keeps the first row per (test_type, model) in ORDER BY order,
    -- i.e. the row with the lowest firmware_version; min_avg is its throughput
    SELECT DISTINCT ON (test_type, model)
           test_type, brand, model, band, firmware_version, avg_throughput AS min_avg
    FROM temp_table
    ORDER BY test_type, model, firmware_version
) b
    ON v.test_type = b.test_type
   AND v.brand = b.brand
   AND v.model = b.model
   AND v.band = b.band;

Thanks, everyone, for the help!