Group similar rows and count the groups in PostgreSQL

Time: 2018-06-13 09:46:20

Tags: sql postgresql gaps-and-islands

I have a table like this:

number | info | side
--------------------
     1 |  foo |    a
     2 |  bar |    a
     3 |  bar |    a
     4 |  baz |    a
     5 |  foo |    a
     6 |  bar |    b
     7 |  bar |    b
     8 |  foo |    a
     9 |  bar |    a
    10 |  baz |    a

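I want to count the groups/packs of bar in the info column (for example, rows 2 and 3 form one group, rows 6 and 7 form one group, and row 9 is a group on its own), broken down by side. I'm stuck because I don't know what to google. Whenever I search for things like "group rows" or "merge rows", I only find information about GROUP BY.

But I think I need some kind of window function.

Here is what I want to achieve: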

bar_a | bar_b
-------------
    2 |     1

3 answers:

Answer 0: (score: 3)

Use lag() to determine the first row of each group:

select 
    number, info, side, 
    lag(info || side, 1, '') over (order by number) <> info || side as start_of_group
from my_table
order by 1;

 number | info | side | start_of_group 
--------+------+------+----------------
      1 | foo  | a    | t
      2 | bar  | a    | t
      3 | bar  | a    | f
      4 | baz  | a    | t
      5 | foo  | a    | t
      6 | bar  | b    | t
      7 | bar  | b    | f
      8 | foo  | a    | t
      9 | bar  | a    | t
     10 | baz  | a    | t
(10 rows)

Then aggregate and filter the above result (keep only the rows where start_of_group is true and info is 'bar', and count them per side) to get the desired output.
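To make the start_of_group idea concrete, here is a minimal Python sketch of the same logic, not the answer's own query. It assumes the sample rows from the question; the function name count_group_starts is hypothetical. Each row is compared with the previous one, the way lag() does, and only the rows that open a new 'bar' group are counted.

```python
# Sample rows from the question: (number, info, side).
rows = [
    (1, "foo", "a"), (2, "bar", "a"), (3, "bar", "a"), (4, "baz", "a"),
    (5, "foo", "a"), (6, "bar", "b"), (7, "bar", "b"), (8, "foo", "a"),
    (9, "bar", "a"), (10, "baz", "a"),
]


def count_group_starts(rows, wanted_info="bar"):
    """Count groups of wanted_info per side by flagging group starts.

    Mirrors: lag(info || side, 1, '') over (order by number) <> info || side
    (count_group_starts is a hypothetical helper, not part of the answer).
    """
    counts = {}
    previous_key = ""  # plays the role of lag()'s default for the first row
    for number, info, side in sorted(rows):  # order by number
        key = info + side
        start_of_group = key != previous_key
        if start_of_group and info == wanted_info:
            counts[side] = counts.get(side, 0) + 1
        previous_key = key
    return counts


print(count_group_starts(rows))  # {'a': 2, 'b': 1}
```

The result matches the bar_a = 2, bar_b = 1 output requested in the question.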

Answer 1: (score: 2)

This is essentially a "gaps-and-islands" problem, if I understand correctly. For this version, the difference of row numbers should work fine.

select sum( (side = 'a')::int) as num_a,
       sum( (side = 'b')::int) as num_b
from (select info, side, count(*) as cnt
      from (select t.*,
                   row_number() over (order by number) as seqnum,
                   row_number() over (partition by info, side order by number) as seqnum_bs
            from t
           ) t
      where info = 'bar'
      group by info, side, (seqnum - seqnum_bs)
     ) si;
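The difference-of-row-numbers idea can be checked by hand with a small Python sketch. This is an illustration under assumptions, not a translation of the exact query: it uses the sample rows from the question, and count_islands is a hypothetical name. seqnum numbers all rows, seqnum_bs numbers rows within each (info, side) partition, and their difference stays constant inside one island.

```python
from collections import defaultdict

# Sample rows from the question: (number, info, side).
rows = [
    (1, "foo", "a"), (2, "bar", "a"), (3, "bar", "a"), (4, "baz", "a"),
    (5, "foo", "a"), (6, "bar", "b"), (7, "bar", "b"), (8, "foo", "a"),
    (9, "bar", "a"), (10, "baz", "a"),
]


def count_islands(rows, wanted_info="bar"):
    """Count islands of wanted_info per side (hypothetical helper)."""
    seqnum_in_partition = defaultdict(int)
    islands = set()
    # seqnum: row_number() over (order by number)
    for seqnum, (number, info, side) in enumerate(sorted(rows), start=1):
        # seqnum_bs: row_number() over (partition by info, side order by number)
        seqnum_in_partition[(info, side)] += 1
        seqnum_bs = seqnum_in_partition[(info, side)]
        if info == wanted_info:
            # group by info, side, (seqnum - seqnum_bs)
            islands.add((side, seqnum - seqnum_bs))
    counts = defaultdict(int)
    for side, _difference in islands:
        counts[side] += 1
    return dict(counts)


print(count_islands(rows))
```

For the sample data the 'bar' rows on side a get the differences 1, 1 and 6, and those on side b get 5 and 5, which gives two islands for a and one for b.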

Answer 2: (score: 2)

You can use a single window function, which should be the fastest option:

SELECT side, count(*) AS count
FROM  (
   SELECT side, grp
   FROM  (
      SELECT side, number - row_number() OVER (PARTITION BY side ORDER BY number) AS grp
      FROM   tbl
      WHERE  info = 'bar'
      ) sub1
   GROUP BY 1, 2
   ) sub2
GROUP BY 1
ORDER BY 1;  -- optional

Or shorter, though probably not faster:

SELECT side, count(DISTINCT grp) AS count
FROM  (
   SELECT side, number - row_number() OVER (PARTITION BY side ORDER BY number) AS grp
   FROM   tbl
   WHERE  info = 'bar'
   ) sub
GROUP BY 1
ORDER BY 1;  -- optional

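The "trick" is that adjacent rows forming a group have consecutive numbers. When you subtract the running count over the partition by side from the running count over all rows (number), the members of a "group" all get the same grp number.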
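As a worked example of that subtraction, the following Python sketch lists the grp value for each 'bar' row of the sample table. It is only an illustration: grp_values is a hypothetical name, and the rows are the ones from the question.

```python
from collections import defaultdict

# Sample rows from the question: (number, info, side).
rows = [
    (1, "foo", "a"), (2, "bar", "a"), (3, "bar", "a"), (4, "baz", "a"),
    (5, "foo", "a"), (6, "bar", "b"), (7, "bar", "b"), (8, "foo", "a"),
    (9, "bar", "a"), (10, "baz", "a"),
]


def grp_values(rows, wanted_info="bar"):
    """Return (number, side, grp) per matching row (hypothetical helper).

    grp = number - row_number() OVER (PARTITION BY side ORDER BY number)
    """
    row_number = defaultdict(int)
    result = []
    for number, info, side in sorted(rows):
        if info != wanted_info:
            continue  # WHERE info = 'bar' filters before the window function runs
        row_number[side] += 1
        result.append((number, side, number - row_number[side]))
    return result


for number, side, grp in grp_values(rows):
    print(number, side, grp)
# 2 a 1
# 3 a 1
# 6 b 5
# 7 b 5
# 9 a 6
```

Rows 2 and 3 share grp 1, rows 6 and 7 share grp 5, and row 9 gets grp 6, so counting distinct (side, grp) pairs yields 2 for a and 1 for b.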

If there are gaps in your sequence column number, which is not the case in your demo but is typical in practice (and you actually want to ignore those gaps?!), then use row_number() OVER (ORDER BY number) in a subquery instead of just number to close the gaps first:

SELECT side, count(DISTINCT grp) AS count
FROM  (
   SELECT side, number - row_number() OVER (PARTITION BY side ORDER BY number) AS grp
   FROM  (
      SELECT info, side, row_number() OVER (ORDER BY number) AS number
      FROM   tbl
      ) tbl1
   WHERE  info = 'bar'
   ) sub2
GROUP BY 1
ORDER BY 1;  -- optional
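Why closing the gaps matters can be seen in a short Python sketch. Everything here is an assumption for illustration: the gappy_rows data is made up (it is not from the question), and count_groups with its close_gaps flag is a hypothetical helper that mimics the inner row_number() OVER (ORDER BY number) renumbering.

```python
from collections import defaultdict

# Hypothetical data: (number, info, side) with gaps in the number column.
gappy_rows = [
    (10, "foo", "a"), (20, "bar", "a"), (30, "bar", "a"), (40, "baz", "a"),
    (50, "bar", "b"), (70, "bar", "b"),
]


def count_groups(rows, wanted_info="bar", close_gaps=True):
    """Count groups per side via number - row_number (hypothetical helper)."""
    ordered = sorted(rows)
    if close_gaps:
        # Inner subquery: row_number() OVER (ORDER BY number) AS number
        ordered = [
            (new_number, info, side)
            for new_number, (_old, info, side) in enumerate(ordered, start=1)
        ]
    row_number = defaultdict(int)
    groups = set()
    for number, info, side in ordered:
        if info != wanted_info:
            continue
        row_number[side] += 1
        groups.add((side, number - row_number[side]))  # count(DISTINCT grp)
    counts = defaultdict(int)
    for side, _grp in groups:
        counts[side] += 1
    return dict(counts)


print(count_groups(gappy_rows, close_gaps=False))  # {'a': 2, 'b': 2}
print(count_groups(gappy_rows, close_gaps=True))   # {'a': 1, 'b': 1}
```

With the raw numbers, adjacent 'bar' rows no longer differ by exactly 1, so every row looks like its own group. After renumbering, each run of adjacent rows collapses into a single grp again.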

SQL Fiddle (extended test case)
