SQL group by and sum based on distinct values in another column (sum once if the value in the other column is repeated)

Time: 2020-09-12 01:33:29

Tags: sql group-by sum teradata distinct

I need help with a group-by query. My table looks like this:

CREATE MULTISET TABLE MY_TABLE (PERSON CHAR(1), ITEM CHAR(1), COST INT);
INSERT INTO MY_TABLE VALUES ('A', '1', 5);
INSERT INTO MY_TABLE VALUES ('A', '1', 5);
INSERT INTO MY_TABLE VALUES ('A', '2', 1);
INSERT INTO MY_TABLE VALUES ('B', '3', 0);
INSERT INTO MY_TABLE VALUES ('B', '4', 10);
INSERT INTO MY_TABLE VALUES ('B', '4', 10);
INSERT INTO MY_TABLE VALUES ('C', '5', 1);
INSERT INTO MY_TABLE VALUES ('C', '5', 1);
INSERT INTO MY_TABLE VALUES ('C', '5', 1);
+--------+------+------+
| PERSON | ITEM | COST |
+--------+------+------+
| A      | 1    |    5 |
| A      | 1    |    5 |
| A      | 2    |    1 |
| B      | 3    |    0 |
| B      | 4    |   10 |
| B      | 4    |   10 |
| C      | 5    |    1 |
| C      | 5    |    1 |
| C      | 5    |    1 |
+--------+------+------+

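I need to group the items and costs by person, but in different ways. For each person, I need the number of distinct items they have. For example, person A has two distinct items, item 1 and item 2. I can get this with COUNT(DISTINCT ITEM).

Then, for each person, I need to sum the cost, but only once per distinct item (for duplicate items, the cost is always the same). For example, person A has item 1 at $5, item 1 again at $5, and item 2 at $1. Because this person has item 1 twice, I count the $5 once and then add $1 for item 2, for a total of $6. The output should look like this: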

+--------+---------------------+------------------------+
| PERSON | ITEM_DISTINCT_COUNT | COST_DISTINCT_ITEM_SUM |
+--------+---------------------+------------------------+
| A      |                   2 |                      6 |
| B      |                   2 |                     10 |
| C      |                   1 |                      1 |
+--------+---------------------+------------------------+

Is there a simple way to do this that performs well on a large number of rows?

SELECT PERSON
  ,COUNT(DISTINCT ITEM) ITEM_DISTINCT_COUNT
  -- help with COST_DISTINCT_ITEM_SUM
FROM MY_TABLE
GROUP BY PERSON

2 Answers:

Answer 0 (score: 1)

I would suggest two levels of aggregation:

select person, count(*) as num_items, sum(cost)
from (select person, item, avg(cost) as cost
      from my_table t
      group by person, item
     ) t
group by person;
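
As a quick sanity check, the two-level aggregation can be run against the sample rows from the question. The sketch below uses Python's built-in sqlite3 module with an in-memory database. That is an assumption made for convenience: the question targets Teradata, so the MULTISET keyword is dropped here. The helper name two_level_totals, the column aliases, and the ORDER BY are also added for illustration and are not part of the answer.

```python
import sqlite3

# Sample rows copied from the question's INSERT statements.
ROWS = [
    ("A", "1", 5), ("A", "1", 5), ("A", "2", 1),
    ("B", "3", 0), ("B", "4", 10), ("B", "4", 10),
    ("C", "5", 1), ("C", "5", 1), ("C", "5", 1),
]

# Inner query: one row per (person, item). Outer query: one row per person.
# Aliases and ORDER BY are added here only to make the result easy to compare.
QUERY = """
select person,
       count(*)  as item_distinct_count,
       sum(cost) as cost_distinct_item_sum
from (select person, item, avg(cost) as cost
      from my_table
      group by person, item
     ) t
group by person
order by person
"""


def two_level_totals(rows=ROWS):
    """Load the rows into an in-memory SQLite table and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table my_table (person char(1), item char(1), cost int)")
        conn.executemany("insert into my_table values (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in two_level_totals():
        print(row)
```

Because avg() yields a floating-point value in SQLite, the sums come back as floats there. The totals are numerically the 6, 10, and 1 requested in the question.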

Answer 1 (score: 1)

You can create a subquery that gets the distinct values of item and cost for each person, and then aggregate over it:

SELECT PERSON, 
       COUNT(ITEM) AS ITEM_DISTINCT_COUNT,
       SUM(COST) AS COST_DISTINCT_ITEM_SUM 
FROM (
  SELECT DISTINCT PERSON, ITEM, COST
  FROM MY_TABLE
) M
GROUP BY PERSON
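
The query above can be checked against the question's sample data in the same spirit. This is a minimal sketch that assumes SQLite (through Python's sqlite3 module) in place of Teradata. The helper name distinct_totals and the trailing ORDER BY are illustrative additions.

```python
import sqlite3

# Sample rows copied from the question's INSERT statements.
ROWS = [
    ("A", "1", 5), ("A", "1", 5), ("A", "2", 1),
    ("B", "3", 0), ("B", "4", 10), ("B", "4", 10),
    ("C", "5", 1), ("C", "5", 1), ("C", "5", 1),
]

# The DISTINCT subquery collapses exact duplicate (person, item, cost) rows
# before aggregating. ORDER BY is added only for a deterministic result order.
QUERY = """
SELECT PERSON,
       COUNT(ITEM) AS ITEM_DISTINCT_COUNT,
       SUM(COST) AS COST_DISTINCT_ITEM_SUM
FROM (
  SELECT DISTINCT PERSON, ITEM, COST
  FROM MY_TABLE
) M
GROUP BY PERSON
ORDER BY PERSON
"""


def distinct_totals(rows=ROWS):
    """Load the rows into an in-memory SQLite table and run QUERY (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE MY_TABLE (PERSON CHAR(1), ITEM CHAR(1), COST INT)")
        conn.executemany("INSERT INTO MY_TABLE VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in distinct_totals():
        print(row)
```

This approach relies on the condition stated in the question, that a repeated item always carries the same cost. If one person had the same item with two different costs, the DISTINCT subquery would keep both rows and the item would be counted twice.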
