Combining multiple SELECT statements that each have a WHERE clause

Asked: 2013-02-08 18:08:18

Tags: sql

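I'm trying to normalize my data, which was imported from an Excel worksheet. The file I'm pulling from has a bunch of columns such as sibling1_name, sibling1_age, sibling1_affected, and so on, for up to 4 siblings, 4 children, 4 relatives, etc. I want to load all of it into a new table with name, age, affected, and relationship columns. I've worked out how to insert the first sibling correctly (see below), but I'm not sure how to add the other siblings. Any suggestions?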

INSERT INTO Family
            (ID,
             Name,
             Age,
             Affected,
             Relationship)
SELECT ExcelPatients.id,
       ExcelPatients.sibling1_name     AS Name,
       ExcelPatients.sibling1_age      AS Age,
       ExcelPatients.sibling1_affected AS Affected,
       "Sibling"
FROM   ExcelPatients
WHERE  (( ( ExcelPatients.Sibling1_name ) IS NOT NULL ))
       AND ExcelPatients.id NOT IN (SELECT DISTINCT ID  AND Name
                                    FROM   Family); 

1 answer:

Answer 0: (score: 1)

INSERT INTO Family
            (ID,
             Name,
             Age,
             Affected,
             Relationship)

SELECT ExcelPatients.id, ExcelPatients.sibling1_name AS Name, 
ExcelPatients.sibling1_age AS Age, 
ExcelPatients.sibling1_affected AS Affected, "Sibling"
FROM ExcelPatients
WHERE (((ExcelPatients.Sibling1_name) Is Not Null))
AND NOT EXISTS  (SELECT DISTINCT ID FROM Family where family.id =  ExcelPatients.id and Family.name =  ExcelPatients.sibling1_name)

UNION

SELECT ExcelPatients.id, ExcelPatients.sibling2_name AS Name, 
ExcelPatients.sibling2_age AS Age, 
ExcelPatients.sibling2_affected AS Affected, "Sibling"
FROM ExcelPatients
WHERE (((ExcelPatients.Sibling2_name) Is Not Null))
AND NOT EXISTS  (SELECT DISTINCT ID FROM Family where family.id =  ExcelPatients.id and Family.name =  ExcelPatients.sibling2_name)

UNION

SELECT ExcelPatients.id, ExcelPatients.sibling3_name AS Name, 
ExcelPatients.sibling3_age AS Age, 
ExcelPatients.sibling3_affected AS Affected, "Sibling"
FROM ExcelPatients
WHERE (((ExcelPatients.Sibling3_name) Is Not Null))
AND NOT EXISTS  (SELECT DISTINCT ID FROM Family where family.id =  ExcelPatients.id and Family.name =  ExcelPatients.sibling3_name)

UNION 

SELECT ExcelPatients.id, ExcelPatients.sibling4_name AS Name, 
ExcelPatients.sibling4_age AS Age, 
ExcelPatients.sibling4_affected AS Affected, "Sibling"
FROM ExcelPatients
WHERE (((ExcelPatients.Sibling4_name) Is Not Null))
AND NOT EXISTS  (SELECT DISTINCT ID FROM Family where family.id =  ExcelPatients.id and Family.name =  ExcelPatients.sibling4_name)

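Without seeing the data, I can't tell whether UNION ALL or UNION is the right choice. If a name can appear in only one of the 4 sibling columns, use UNION ALL; if the same name can be repeated across columns, use UNION, which removes duplicate rows. Since you are cleaning up data from another source, UNION is probably the safer choice, though the slower one. NOT EXISTS tends to be the fastest of these comparisons in SQL Server, which is why I chose it.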
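To make the UNION versus UNION ALL difference concrete, here is a minimal sketch. It is not the asker's environment: it uses SQLite through Python's built-in sqlite3 module purely so it can be run anywhere, it trims the table to two sibling columns, and the sample rows (Ann, Bob) are invented for illustration. The helper name `unpivot` is also hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ExcelPatients (
        id INTEGER,
        sibling1_name TEXT, sibling1_age INTEGER, sibling1_affected TEXT,
        sibling2_name TEXT, sibling2_age INTEGER, sibling2_affected TEXT
    );
    CREATE TABLE Family (
        ID INTEGER, Name TEXT, Age INTEGER, Affected TEXT, Relationship TEXT
    );
    -- Invented sample data: patient 1 has the same sibling typed into
    -- both columns; patient 2 has only one sibling.
    INSERT INTO ExcelPatients VALUES
        (1, 'Ann', 30, 'Y', 'Ann', 30, 'Y'),
        (2, 'Bob', 25, 'N', NULL, NULL, NULL);
""")


def unpivot(set_operator):
    """Build one SELECT per sibling column, joined by the given operator."""
    branches = []
    for n in (1, 2):  # fixed column numbers, not user input
        branches.append(f"""
            SELECT p.id, p.sibling{n}_name, p.sibling{n}_age,
                   p.sibling{n}_affected, 'Sibling'
            FROM ExcelPatients AS p
            WHERE p.sibling{n}_name IS NOT NULL
              AND NOT EXISTS (SELECT 1 FROM Family AS f
                              WHERE f.ID = p.id
                                AND f.Name = p.sibling{n}_name)
        """)
    return f" {set_operator} ".join(branches)


# UNION ALL keeps the repeated Ann row; UNION collapses it.
rows_union_all = conn.execute(unpivot("UNION ALL")).fetchall()
rows_union = conn.execute(unpivot("UNION")).fetchall()
print(len(rows_union_all), len(rows_union))  # 3 2

# Running the INSERT twice adds nothing the second time, because
# NOT EXISTS skips (ID, Name) pairs that are already in Family.
insert_sql = (
    "INSERT INTO Family (ID, Name, Age, Affected, Relationship) "
    + unpivot("UNION")
)
conn.execute(insert_sql)
first_run = conn.execute("SELECT COUNT(*) FROM Family").fetchone()[0]
conn.execute(insert_sql)
second_run = conn.execute("SELECT COUNT(*) FROM Family").fetchone()[0]
print(first_run, second_run)  # 2 2
```

The sketch uses single-quoted 'Sibling' because SQLite, like SQL Server under its default settings, treats double quotes as identifier delimiters. Performance on SQL Server or Access may differ from what SQLite does here, so the timing claims above are not demonstrated by this example.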