How to get distinct groups

Time: 2016-10-05 10:06:17

Tags: sql-server group-by distinct

I have a table like this:

name_id     disease_id   
-------     ----------    
1           1    
1           2    
2           2    
3           1    
3           3    
4           1    
4           2    
5           2    

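I need to find the unique combinations of diseases across the whole table. I can't group by name_id, because that yields non-distinct combinations (see name_id 1 and 4). I can't omit the GROUP BY either, because that just produces a flat list of disease_id values across all name_ids.

The question I want to be able to answer is: what are the unique combinations of childhood diseases that occur in the population (possibly with counts added later)?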

Chickenpox          (10)
Chickenpox+Measles  (2)
Measles             (5)
Measles+Mumps       (1)
etc.

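2 Answers:

Answer 0: (score: 0)

You want to be able to answer a question like "What are the unique combinations of childhood diseases that occur in the population?"

My answer is based on the following assumptions: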

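  1. disease_id refers to chickenpox, measles, mumps, etc.
  2. name_id refers to a patient.
  3. disease_id 1 means 'Chickenpox' and disease_id 3 means 'Measles'; both are considered childhood diseases. (This is only an example; yours may differ.)

Based on these assumptions, the following query produces the desired result: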

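    -- patients is a placeholder name for the question's (name_id, disease_id) table.
    -- Assumes no duplicate (name_id, disease_id) rows.
    SELECT 'Chickenpox' AS DiseaseName, COUNT(*) AS Patients
    FROM (
        -- patients whose only disease is 1
        SELECT name_id FROM patients GROUP BY name_id
        HAVING COUNT(*) = 1 AND MIN(disease_id) = 1
        ) d1
    UNION ALL
    SELECT 'Measles' AS DiseaseName, COUNT(*) AS Patients
    FROM (
        -- patients whose only disease is 3
        SELECT name_id FROM patients GROUP BY name_id
        HAVING COUNT(*) = 1 AND MIN(disease_id) = 3
        ) d3
    UNION ALL
    SELECT 'Chickenpox + Measles' AS DiseaseName, COUNT(*) AS Patients
    FROM (
        -- patients with exactly diseases 1 and 3
        SELECT name_id FROM patients GROUP BY name_id
        HAVING COUNT(*) = 2 AND MIN(disease_id) = 1 AND MAX(disease_id) = 3
        ) d1d3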
    

The result would look like this:

    DiseaseName           Patients
    --------------------  ---------
    Chickenpox            10
    Measles               5
    Chickenpox + Measles  2
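
Counting patients who have exactly one given set of diseases means the per-patient check has to look at all of that patient's rows, so the condition belongs in HAVING rather than in a WHERE filter applied before grouping. Below is a minimal sketch of the difference, using the sample rows from the question. The table name `patients` is an assumption, since the question does not name its table.

```sql
-- Hypothetical table name; rows copied from the question's sample data.
CREATE TABLE patients (name_id int NOT NULL, disease_id int NOT NULL);
INSERT INTO patients (name_id, disease_id) VALUES
    (1, 1), (1, 2), (2, 2), (3, 1), (3, 3), (4, 1), (4, 2), (5, 2);

-- Filtering first: finds everyone who has disease 2, including patients
-- who also have other diseases (name_id 1, 2, 4 and 5).
SELECT name_id
FROM patients
WHERE disease_id = 2
GROUP BY name_id
HAVING COUNT(*) = 1;

-- Checking the whole group: finds only patients whose single disease is 2
-- (name_id 2 and 5).
SELECT name_id
FROM patients
GROUP BY name_id
HAVING COUNT(*) = 1 AND MIN(disease_id) = 2;
```

This approach still needs one hand-written subquery per combination, so it only suits a short, known list of combinations.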
    

Answer 1: (score: 0)

The solution has two steps.

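  1. List the diseases each patient suffers from and store the result in a separate table.
  2. Use that table to list the distinct combinations of diseases and the number of patients who have each combination.

For step #1, you need a stored procedure like the following: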

    DECLARE @prv int
    DECLARE @nid int
    DECLARE @dname varchar(100)
    DECLARE @combi varchar(500)
    
    DECLARE c1 CURSOR FOR
    SELECT name_id, disease_name 
    FROM patients 
    JOIN diseases ON patients.disease_id = diseases.disease_id
    ORDER BY name_id, disease_name;
    
    DELETE FROM diseasecombi;
    
    OPEN c1
    FETCH NEXT FROM c1 INTO @nid, @dname 
    SET @prv = @nid
    SET @combi = ''
    
    WHILE @@FETCH_STATUS = 0
    BEGIN
        IF @prv <> @nid
        BEGIN
            INSERT INTO diseasecombi (name_id, suffers) VALUES (@prv, @combi)
            SET @prv = @nid
            SET @combi = ''
        END
        IF LEN(@combi) > 0 SET @combi = @combi + ', '
        SET @combi = @combi + @dname
        FETCH NEXT FROM c1 INTO @nid, @dname
    END
    INSERT INTO diseasecombi(name_id, suffers) VALUES (@prv, @combi)
    
    CLOSE c1
    DEALLOCATE c1
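
The script relies on three tables that are not defined in the answer. The definitions below are a guess inferred from the column names and variable types the cursor uses. The keys and constraints are assumptions, not part of the original answer.

```sql
-- Hypothetical schemas inferred from the cursor script; adjust as needed.
CREATE TABLE diseases (
    disease_id   int          NOT NULL PRIMARY KEY,
    disease_name varchar(100) NOT NULL   -- same size as @dname
);

CREATE TABLE patients (
    name_id    int NOT NULL,             -- read into @nid
    disease_id int NOT NULL REFERENCES diseases (disease_id),
    PRIMARY KEY (name_id, disease_id)    -- one row per patient and disease
);

CREATE TABLE diseasecombi (
    name_id int          NOT NULL PRIMARY KEY,
    suffers varchar(500) NOT NULL        -- same size as @combi
);
```

The primary key on `patients` matters for the result: a duplicate (name_id, disease_id) row would repeat a disease name inside `suffers` and split otherwise identical combinations.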
    

The stored procedure above will produce a table that looks like this:

    name_id suffers
    ------- -------------------------------
    1       Chickenpox, Mumps
    2       Chickenpox, Fibroids, Measles
    3       Chickenpox, Mumps
    4       Chickenpox, Measles
    5       Chickenpox, Measles
    6       Chickenpox
    7       Rashes
    

For step #2, the query is as follows:

    SELECT suffers AS Diseases, COUNT(*) AS Patients 
    FROM diseasecombi
    GROUP BY suffers
    ORDER BY suffers
    

which produces output like this:

    Diseases                        Patients
    ------------------------------  ----------
    Chickenpox                      1
    Chickenpox, Fibroids, Measles   1
    Chickenpox, Measles             2
    Chickenpox, Mumps               2
    Rashes                          1
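
The two steps above can also be sketched as a single set-based query, with no cursor and no intermediate table. This is an alternative, not the answer's own method. It assumes the same `patients` and `diseases` tables, and it needs SQL Server 2017 or later because it uses `STRING_AGG`.

```sql
-- Requires SQL Server 2017+ (STRING_AGG).
-- Assumes the patients and diseases tables used by the cursor script.
WITH combos AS (
    -- Step 1: one row per patient, disease names joined in a fixed order
    -- so that equal sets always produce the same string.
    SELECT p.name_id,
           STRING_AGG(d.disease_name, ', ')
               WITHIN GROUP (ORDER BY d.disease_name) AS suffers
    FROM patients AS p
    JOIN diseases AS d ON d.disease_id = p.disease_id
    GROUP BY p.name_id
)
-- Step 2: count the patients sharing each combination.
SELECT suffers AS Diseases, COUNT(*) AS Patients
FROM combos
GROUP BY suffers
ORDER BY suffers;
```

The `WITHIN GROUP (ORDER BY ...)` clause plays the same role as the `ORDER BY name_id, disease_name` in the cursor: without a stable order, the same set of diseases could be concatenated in different orders and counted as different combinations. On older versions of SQL Server, the usual substitute for `STRING_AGG` is the `FOR XML PATH('')` concatenation idiom.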