Sum when a join duplicates rows (SQL)

Asked: 2016-12-06 23:31:16

Tags: mysql sql

My query returns the following result:

SELECT ... ON CIA_factbook_dataset.my_name = World_Bank_dataset.my_name ...

+----------------+------+-------------+-----------------+---------+--------+
| my_name        | Year | CIA_name    | World_Bank_name | CIA_GDP | WB_GDP |
+----------------+------+-------------+-----------------+---------+--------+
| United Kingdom | 2010 | UK          | United Kingdom  | 2850    | 2800   |
| United Kingdom | 2010 | UK          | Channel Islands | 2850    |   11   |
| Cyprus         | 2010 | CYPRUS TURK | CYPRUS TURK     |   22    |   22   |
| Cyprus         | 2010 | CYPRUS TURK | CYPRUS GRK      |   22    |   33   |
| Cyprus         | 2010 | CYPRUS GRK  | CYPRUS TURK     |   33    |   22   |
| Cyprus         | 2010 | CYPRUS GRK  | CYPRUS GRK      |   33    |   33   |
+----------------+------+-------------+-----------------+---------+--------+

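I need to sum the data of the sub-countries/regions, but if I simply use GROUP BY my_name, Year, the same numbers are added several times, because the join repeats each source row once per matching row from the other dataset.

The final result should be: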

+----------------+------+---------+--------+
| my_name        | Year | CIA_GDP | WB_GDP |
+----------------+------+---------+--------+
| United Kingdom | 2010 | 2850    | 2811   |
| Cyprus         | 2010 |   55    |   55   |
+----------------+------+---------+--------+

rather than:

+----------------+------+---------+--------+
| my_name        | Year | CIA_GDP | WB_GDP |
+----------------+------+---------+--------+
| United Kingdom | 2010 | 5700    | 2811   |
| Cyprus         | 2010 |  110    |  110   |
+----------------+------+---------+--------+

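How can I achieve this?
Is there a better way than using SUM(DISTINCT CIA_GDP), SUM(DISTINCT WB_GDP)? (In theory, Turkish Cyprus and Greek Cyprus could have the same GDP, and then one of the two values would be dropped.)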
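
The concern in the parentheses can be reproduced with a small sketch. The table name `gdp_equal_demo` and its rows are made up for illustration (both parts of Cyprus are given a GDP of 22); only the column names are taken from the result above.

```sql
-- Hypothetical demo table with the same columns as the joined result above.
CREATE TABLE gdp_equal_demo
    (`my_name` varchar(14), `Year` int, `CIA_name` varchar(11),
     `World_Bank_name` varchar(15), `CIA_GDP` int, `WB_GDP` int);

-- Made-up rows: both sub-regions report the same GDP (22) in both sources.
INSERT INTO gdp_equal_demo
    (`my_name`, `Year`, `CIA_name`, `World_Bank_name`, `CIA_GDP`, `WB_GDP`)
VALUES
    ('Cyprus', 2010, 'CYPRUS TURK', 'CYPRUS TURK', 22, 22),
    ('Cyprus', 2010, 'CYPRUS TURK', 'CYPRUS GRK',  22, 22),
    ('Cyprus', 2010, 'CYPRUS GRK',  'CYPRUS TURK', 22, 22),
    ('Cyprus', 2010, 'CYPRUS GRK',  'CYPRUS GRK',  22, 22);

-- SUM(DISTINCT ...) adds each distinct value only once, so the two equal
-- GDP figures collapse into a single 22.
SELECT my_name, Year,
       SUM(DISTINCT CIA_GDP) AS CIA_GDP,
       SUM(DISTINCT WB_GDP)  AS WB_GDP
FROM gdp_equal_demo
GROUP BY my_name, Year;
```

With these rows the query returns 22 in both columns, whereas the intended total for the two sub-regions is 44.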

2 answers:

Answer 0 (score: 2)

SQL Fiddle

MySQL 5.6 Schema Setup

CREATE TABLE t
    (`my_name` varchar(14), `Year` int, `CIA_name` varchar(11), `World_Bank_name` varchar(15), `CIA_GDP` int, `WB_GDP` int)
;

INSERT INTO t
    (`my_name`, `Year`, `CIA_name`, `World_Bank_name`, `CIA_GDP`, `WB_GDP`)
VALUES
    ('United Kingdom', 2010, 'UK', 'United Kingdom', 2850, 2800),
    ('United Kingdom', 2010, 'UK', 'Channel Islands', 2850, 11),
    ('Cyprus', 2010, 'CYPRUS TURK', 'CYPRUS TURK', 22, 22),
    ('Cyprus', 2010, 'CYPRUS TURK', 'CYPRUS GRK', 22, 33),
    ('Cyprus', 2010, 'CYPRUS GRK', 'CYPRUS TURK', 33, 22),
    ('Cyprus', 2010, 'CYPRUS GRK', 'CYPRUS GRK', 33, 33)
;

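Query 1

-- Outer query: sum the CIA_GDP values that remain after the inner grouping
SELECT my_name, Year, SUM(CIA_GDP), WB_GDP
FROM (
  -- Inner query: one row per distinct CIA_GDP value, with its WB_GDP total
  SELECT my_name, Year, CIA_GDP, SUM(WB_GDP) AS WB_GDP
  FROM t
  GROUP BY my_name, Year, CIA_GDP
  ) t1
GROUP BY my_name, Year, WB_GDP

Results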

|        my_name | Year | SUM(CIA_GDP) | WB_GDP |
|----------------|------|--------------|--------|
|         Cyprus | 2010 |           55 |     55 |
| United Kingdom | 2010 |         2850 |   2811 |
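
A hedged variation on the query above: it groups on the CIA_GDP value, so, if I read it correctly, two sub-regions with the same GDP would fall into one group. The sketch below deduplicates on the source names (CIA_name, World_Bank_name) instead. The table name `gdp_names_demo` and the Cyprus figures (both parts set to 22) are made up for illustration, and this is not the answerer's query.

```sql
-- Hypothetical demo table with the same columns as t above.
CREATE TABLE gdp_names_demo
    (`my_name` varchar(14), `Year` int, `CIA_name` varchar(11),
     `World_Bank_name` varchar(15), `CIA_GDP` int, `WB_GDP` int);

-- UK rows as in the question; Cyprus rows are made up so that both
-- sub-regions have the same GDP (22).
INSERT INTO gdp_names_demo
    (`my_name`, `Year`, `CIA_name`, `World_Bank_name`, `CIA_GDP`, `WB_GDP`)
VALUES
    ('United Kingdom', 2010, 'UK', 'United Kingdom', 2850, 2800),
    ('United Kingdom', 2010, 'UK', 'Channel Islands', 2850, 11),
    ('Cyprus', 2010, 'CYPRUS TURK', 'CYPRUS TURK', 22, 22),
    ('Cyprus', 2010, 'CYPRUS TURK', 'CYPRUS GRK',  22, 22),
    ('Cyprus', 2010, 'CYPRUS GRK',  'CYPRUS TURK', 22, 22),
    ('Cyprus', 2010, 'CYPRUS GRK',  'CYPRUS GRK',  22, 22);

SELECT c.my_name, c.Year, c.CIA_GDP, w.WB_GDP
FROM (
    -- One row per CIA sub-region (deduplicated by name), then summed.
    SELECT my_name, Year, SUM(CIA_GDP) AS CIA_GDP
    FROM (SELECT DISTINCT my_name, Year, CIA_name, CIA_GDP
          FROM gdp_names_demo) d1
    GROUP BY my_name, Year
) c
JOIN (
    -- One row per World Bank sub-region (deduplicated by name), then summed.
    SELECT my_name, Year, SUM(WB_GDP) AS WB_GDP
    FROM (SELECT DISTINCT my_name, Year, World_Bank_name, WB_GDP
          FROM gdp_names_demo) d2
    GROUP BY my_name, Year
) w
    ON w.my_name = c.my_name
   AND w.Year = c.Year;
```

With these rows the query returns 2850 and 2811 for United Kingdom and 44 and 44 for Cyprus, because equal values from differently named sub-regions are kept apart.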

Answer 1 (score: 1)

For this, I assume that my_name and Year are unique in both tables.

SQL Fiddle

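SELECT t1.my_name, t1.YEAR, SUM_CIA_GDP, SUM_WB_GDP
FROM (
    -- CIA total per WB_GDP group; the totals coincide here, so DISTINCT
    -- collapses them to one row per my_name and YEAR
    SELECT DISTINCT my_name, YEAR, SUM(CIA_GDP) AS SUM_CIA_GDP
    FROM t
    GROUP BY my_name, YEAR, WB_GDP
    ) t1
JOIN (
    -- World Bank total per CIA_GDP group, collapsed the same way
    SELECT DISTINCT my_name, YEAR, SUM(WB_GDP) AS SUM_WB_GDP
    FROM t
    GROUP BY my_name, YEAR, CIA_GDP
    ) t2
    ON t1.my_name = t2.my_name
        AND t1.YEAR = t2.YEAR

Results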

|        my_name | YEAR | SUM_CIA_GDP | SUM_WB_GDP |
|----------------|------|-------------|------------|
|         Cyprus | 2010 |          55 |         55 |
| United Kingdom | 2010 |        2850 |       2811 |
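
Another option, sketched here under stated assumptions, is to total each dataset on its own before joining, so the join never multiplies rows in the first place. The tables `cia_demo` and `wb_demo` are hypothetical stand-ins for CIA_factbook_dataset and World_Bank_dataset, and the columns `source_name`, `Year` and `GDP` are assumptions, since the original query does not show the real schemas. The figures are the ones from the question.

```sql
-- Hypothetical stand-ins for the two source tables (schemas are assumed).
CREATE TABLE cia_demo
    (`my_name` varchar(14), `Year` int, `source_name` varchar(15), `GDP` int);
CREATE TABLE wb_demo
    (`my_name` varchar(14), `Year` int, `source_name` varchar(15), `GDP` int);

INSERT INTO cia_demo (`my_name`, `Year`, `source_name`, `GDP`) VALUES
    ('United Kingdom', 2010, 'UK', 2850),
    ('Cyprus', 2010, 'CYPRUS TURK', 22),
    ('Cyprus', 2010, 'CYPRUS GRK', 33);

INSERT INTO wb_demo (`my_name`, `Year`, `source_name`, `GDP`) VALUES
    ('United Kingdom', 2010, 'United Kingdom', 2800),
    ('United Kingdom', 2010, 'Channel Islands', 11),
    ('Cyprus', 2010, 'CYPRUS TURK', 22),
    ('Cyprus', 2010, 'CYPRUS GRK', 33);

-- Aggregate each source to one row per (my_name, Year), then join the totals.
SELECT c.my_name, c.Year, c.CIA_GDP, w.WB_GDP
FROM (
    SELECT my_name, Year, SUM(GDP) AS CIA_GDP
    FROM cia_demo
    GROUP BY my_name, Year
) c
JOIN (
    SELECT my_name, Year, SUM(GDP) AS WB_GDP
    FROM wb_demo
    GROUP BY my_name, Year
) w
    ON w.my_name = c.my_name
   AND w.Year = c.Year;
```

On this data the query returns 2850 and 2811 for United Kingdom and 55 and 55 for Cyprus. Note that the inner join drops any country and year present in only one of the two sources.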