Question

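I have a table that contains duplicate entries, and I want to get the distinct entries based on the latest timestamp.

In my case, 'serial_no' has duplicate entries, and I pick one row per serial number based on the latest timestamp.

The query below gives me the distinct results with the latest timestamp. My concern is that I also need the total number of unique entries.

For example, assume my table has 40 entries in total. With the query below I get 20 unique rows based on the serial number, but 'total' returns 40 instead of 20. Any help on this?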

  SELECT 
  * 
  FROM 
  (
    SELECT 
      DISTINCT ON (serial_no) id, 
      serial_no, 
      name, 
      timestamp,
      COUNT(*) OVER() as total 
    FROM 
      product_info 
      INNER JOIN my.account ON id = accountid 
    WHERE 
      lower(name) = 'hello' 
    ORDER BY 
      serial_no, 
      timestamp DESC OFFSET 0 
    LIMIT 
      10
  ) AS my_info 
 ORDER BY 
   serial_no asc

The product_info table initially has this data:

serial_no           name         timestamp                              

11212               pulp12      2018-06-01 20:00:01             
11213               mango       2018-06-01 17:00:01             
11214               grapes      2018-06-02 04:00:01             
11215               orange      2018-06-02 07:05:30             
11212               pulp12      2018-06-03 14:00:01             
11213               mango       2018-06-03 13:00:00             



After the DISTINCT ON query I got all unique results based on the latest 
timestamp:

serial_no       name        timestamp                   total

11212           pulp12     2018-06-03 14:00:01            6
11213           mango      2018-06-03 13:00:00            6
11214           grapes     2018-06-02 04:00:01            6
11215           orange     2018-06-02 07:05:30            6


But total comes back as 6. I want the total to be 4, since there are 
only 4 unique entries.

I am not sure how to modify my existing query to get this 
result.

Answer 1

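What you can do is move the window function up to an outer SELECT. This works because window functions are evaluated before the DISTINCT ON and LIMIT clauses are applied. Also, you cannot use the DISTINCT keyword inside a window function; it has not been implemented (as of Postgres 9.6).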

 SELECT 
  *,
  COUNT(*) OVER() as total -- here
 FROM 
  (
    SELECT 
      DISTINCT ON (serial_no) id, 
      serial_no, 
      name, 
      timestamp
    FROM 
      product_info 
      INNER JOIN my.account ON id = accountid 
    WHERE 
      lower(name) = 'hello' 
    ORDER BY 
      serial_no, 
      timestamp DESC
    LIMIT 
      10
  ) AS my_info

Also, the OFFSET is not needed, and the second sort is superfluous. I removed both.
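The evaluation order can be checked on the sample rows from the question. This is only a sketch: the temporary table `product_info_demo` is a hypothetical stand-in for `product_info`, and the join to `my.account` and the `name` filter are left out because the question does not show that data.

```sql
-- Hypothetical stand-in for product_info, loaded with the question's six sample rows.
CREATE TEMP TABLE product_info_demo (
    serial_no   integer,
    name        text,
    "timestamp" timestamp
);

INSERT INTO product_info_demo (serial_no, name, "timestamp") VALUES
    (11212, 'pulp12', '2018-06-01 20:00:01'),
    (11213, 'mango',  '2018-06-01 17:00:01'),
    (11214, 'grapes', '2018-06-02 04:00:01'),
    (11215, 'orange', '2018-06-02 07:05:30'),
    (11212, 'pulp12', '2018-06-03 14:00:01'),
    (11213, 'mango',  '2018-06-03 13:00:00');

-- Window function in the same SELECT as DISTINCT ON:
-- the count is taken over all 6 rows, before duplicates are discarded.
SELECT DISTINCT ON (serial_no)
       serial_no, name, "timestamp",
       COUNT(*) OVER () AS total          -- 6 on every row
FROM product_info_demo
ORDER BY serial_no, "timestamp" DESC;

-- Window function in the outer SELECT:
-- the count is taken over the 4 rows that survive DISTINCT ON.
SELECT *,
       COUNT(*) OVER () AS total          -- 4 on every row
FROM (
    SELECT DISTINCT ON (serial_no)
           serial_no, name, "timestamp"
    FROM product_info_demo
    ORDER BY serial_no, "timestamp" DESC
) AS my_info;
```

Note that if a LIMIT stays inside the subquery, the outer count can never exceed that limit, because it only sees the rows the subquery returns.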

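Another way is to include a computed column in the SELECT clause, but that would not be as fast, since it requires one more scan of the table. All of this obviously assumes that your total is strictly tied to your result set, and not to what is stored in the table but gets filtered out.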
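One possible reading of the computed-column alternative is a scalar subquery, sketched below. It assumes the `product_info` columns shown in the question's sample data and omits the join and the `name` filter; if those are kept in the main query, the same conditions would have to be repeated inside the subquery for the two to agree.

```sql
-- Sketch only: the scalar subquery scans product_info a second time.
SELECT DISTINCT ON (serial_no)
       serial_no,
       name,
       "timestamp",
       (SELECT COUNT(DISTINCT serial_no)
          FROM product_info) AS total     -- number of distinct serial numbers in the table
FROM product_info
ORDER BY serial_no, "timestamp" DESC
LIMIT 10;
```

Unlike a window count over the limited result set, this total is not capped by the LIMIT, so it reflects every distinct serial number in the table rather than only the rows returned.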

Answer 2

select count(*), serial_no from product_info group by serial_no

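will give you the number of duplicates for each serial number.

The simplest, most mechanical way to incorporate that information is to join against a subquery: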

  SELECT 
  * 
  FROM 
  (
    SELECT 
      DISTINCT ON (serial_no) id, 
      serial_no, 
      name, 
      timestamp,
      COUNT(*) OVER() as total 
    FROM 
      product_info 
      INNER JOIN my.account ON id = accountid 
    WHERE 
      lower(name) = 'hello' 
    ORDER BY 
      serial_no, 
      timestamp DESC OFFSET 0 
    LIMIT 
      10
  ) AS my_info
  JOIN (SELECT COUNT(*) AS counts, serial_no FROM product_info GROUP BY serial_no) AS x
    ON x.serial_no = my_info.serial_no
 ORDER BY 
   my_info.serial_no ASC  -- qualified: both joined relations expose a serial_no column

Answer 3

Postgres supports COUNT(DISTINCT column_name), so if I have understood your request, using that in place of COUNT(*) should work, and you can drop the OVER().
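A minimal sketch of that suggestion, assuming the `product_info` table from the question with no join or filter applied. DISTINCT is only accepted when COUNT is used as a plain aggregate, which is why the OVER() has to go.

```sql
-- As a plain aggregate, DISTINCT is accepted and the query returns a single row.
-- For the six sample rows in the question this yields 4.
SELECT COUNT(DISTINCT serial_no) AS total
FROM product_info;

-- As a window function it is rejected; PostgreSQL reports
-- "DISTINCT is not implemented for window functions".
-- SELECT COUNT(DISTINCT serial_no) OVER () AS total FROM product_info;
```

Because the aggregate form collapses the result to one row, it gives the total on its own rather than alongside each latest-timestamp row.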

Postgres: need a count of distinct records
