SQL: Prevent duplicates across all columns EXCEPT a specified one

Asked: 2017-08-02 20:25:18

Tags: sql sql-server tsql

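I have an odd problem: I want to UNION two queries. I don't want any duplicate rows, but each query has an extra literal column, written in the SELECT clause, that indicates which query the row came from: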

SELECT EOU.EventSessionKey, EOU.EventSourceID, EOU.EventSeq, 
    EOU.EventColumnName, EOU.ProSymbol,
    EOU.ProKey, tP.isActive, tP.Description, 
    'POSITIVE' AS ErrType FROM EOU
LEFT JOIN dbRunoff.dbo.tPro tP 
    ON tP.Symbol COLLATE DATABASE_DEFAULT 
        = EOU.ProSymbol COLLATE DATABASE_DEFAULT
WHERE tP.IsActive = 1 AND (EOU.ProKey IS NULL OR  EOU.ProKey <= 0)

UNION

SELECT EOU.EventSessionKey, EOU.EventSourceID, EOU.EventSeq, 
    EOU.EventColumnName, EOU.ProSymbol,
    EOU.ProKey, tP.isActive, tP.Description, 
    'DUPLICATE' AS ErrType FROM EOU
LEFT JOIN dbRunoff.dbo.tPro tP 
    ON tP.Symbol COLLATE DATABASE_DEFAULT 
        = EOU.ProSymbol COLLATE DATABASE_DEFAULT
JOIN ProFilter PF ON PF.ProKey = tp.ProKey
WHERE tP.IsActive = 1 AND (EOU.ProKey IS NULL OR  EOU.ProKey <= 0)

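I'm concerned that the text column added at the end will prevent UNION from removing duplicates correctly, because two rows that differ only in ErrType are not considered duplicates. Is there a (simple/efficient) way to make the UNION remove rows that are duplicated across the two queries while ignoring the text column?

Please note that I'm looking for a simple way to do this. I know some brute-force methods, but efficiency and speed are concerns.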

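2 Answers:

Answer 0: (score: 3)

Since you want the DUPLICATE version to win, you can use MIN() with a CTE. 'D' sorts before 'P', so when both a POSITIVE and a DUPLICATE row exist for the same key, MIN(ErrType) returns 'DUPLICATE'.

Just a side note: I would rewrite this without the UNION.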

with cte as(
SELECT EOU.EventSessionKey, EOU.EventSourceID, EOU.EventSeq, 
    EOU.EventColumnName, EOU.ProSymbol,
    EOU.ProKey, tP.isActive, tP.Description, 
    'POSITIVE' AS ErrType FROM EOU
LEFT JOIN dbRunoff.dbo.tPro tP 
    ON tP.Symbol COLLATE DATABASE_DEFAULT 
        = EOU.ProSymbol COLLATE DATABASE_DEFAULT
WHERE tP.IsActive = 1 AND (EOU.ProKey IS NULL OR  EOU.ProKey <= 0)

UNION

SELECT EOU.EventSessionKey, EOU.EventSourceID, EOU.EventSeq, 
    EOU.EventColumnName, EOU.ProSymbol,
    EOU.ProKey, tP.isActive, tP.Description, 
    'DUPLICATE' AS ErrType FROM EOU
LEFT JOIN dbRunoff.dbo.tPro tP 
    ON tP.Symbol COLLATE DATABASE_DEFAULT 
        = EOU.ProSymbol COLLATE DATABASE_DEFAULT
JOIN ProFilter PF ON PF.ProKey = tp.ProKey
WHERE tP.IsActive = 1 AND (EOU.ProKey IS NULL OR  EOU.ProKey <= 0))

-- GROUP BY already yields one row per key, so DISTINCT and a self-join are not needed
select
    EventSessionKey
    ,EventSourceID
    ,EventSeq
    ,EventColumnName
    ,ProSymbol
    ,ProKey
    ,isActive
    ,Description
    ,min(ErrType) as ErrType --'DUPLICATE' sorts before 'POSITIVE'
from
    cte
group by
    EventSessionKey
    ,EventSourceID
    ,EventSeq
    ,EventColumnName
    ,ProSymbol
    ,ProKey
    ,isActive
    ,Description
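
To see why MIN() picks the DUPLICATE label, here is a tiny self-contained sketch. The table alias `v` and the column `k` are made up for illustration and stand in for the full set of grouping columns.

```sql
-- Hypothetical data: key 1 appears with both labels, key 2 only with POSITIVE.
SELECT v.k, MIN(v.ErrType) AS ErrType
FROM (VALUES (1, 'POSITIVE'),
             (1, 'DUPLICATE'),
             (2, 'POSITIVE')) AS v(k, ErrType)
GROUP BY v.k
ORDER BY v.k;
-- k = 1 -> 'DUPLICATE' (string comparison puts 'D' before 'P')
-- k = 2 -> 'POSITIVE'
```

The same grouping collapses the two labelled copies of a row into one, which is what the UNION alone cannot do once the label column differs.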

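Edit

Based on what I see in your query, this should give you the same thing. Basically, I changed the INNER JOIN to ProFilter into a LEFT JOIN. If that join finds a match, the row would have been a DUPLICATE record in your UNION. If there is no match, it is POSITIVE.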

SELECT 
    EOU.EventSessionKey, 
    EOU.EventSourceID, 
    EOU.EventSeq, 
    EOU.EventColumnName, 
    EOU.ProSymbol,
    EOU.ProKey, 
    tP.isActive, 
    tP.Description, 
    case when pf.ProKey is null then 'POSITIVE' else 'DUPLICATE' end as ErrType
FROM 
    EOU
LEFT JOIN 
    dbRunoff.dbo.tPro tP 
    ON tP.Symbol COLLATE DATABASE_DEFAULT = EOU.ProSymbol COLLATE DATABASE_DEFAULT
LEFT JOIN 
    ProFilter PF 
    ON PF.ProKey = tp.ProKey
WHERE 
    tP.IsActive = 1 
    AND (EOU.ProKey IS NULL OR  EOU.ProKey <= 0)
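
One caveat with the LEFT JOIN rewrite: if ProFilter can hold more than one row per ProKey (an assumption; the question does not say), the join would repeat rows that the original UNION collapsed. A possible sketch that sidesteps this tests for a match with EXISTS instead of joining, using the same tables and columns as above:

```sql
SELECT
    EOU.EventSessionKey,
    EOU.EventSourceID,
    EOU.EventSeq,
    EOU.EventColumnName,
    EOU.ProSymbol,
    EOU.ProKey,
    tP.isActive,
    tP.Description,
    -- EXISTS only checks for a match, so it never multiplies rows
    CASE WHEN EXISTS (SELECT 1 FROM ProFilter PF WHERE PF.ProKey = tP.ProKey)
         THEN 'DUPLICATE'
         ELSE 'POSITIVE'
    END AS ErrType
FROM
    EOU
LEFT JOIN
    dbRunoff.dbo.tPro tP
    ON tP.Symbol COLLATE DATABASE_DEFAULT = EOU.ProSymbol COLLATE DATABASE_DEFAULT
WHERE
    tP.IsActive = 1
    AND (EOU.ProKey IS NULL OR EOU.ProKey <= 0);
```

If EOU itself may contain fully identical rows, adding DISTINCT to the SELECT would reproduce the de-duplication that UNION performed, at the cost of an extra sort or hash.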

Answer 1: (score: 0)

Hi Josh, I'm no expert, but I think you could try putting your query into a derived table, then selecting MIN(ErrType) and grouping by the remaining columns.

select q.EventSessionKey, q.EventSourceID, q.EventSeq, q.EventColumnName, q.ProSymbol,
q.ProKey, q.isActive, q.Description, min(q.ErrType) as [ErrType] from
(
SELECT EOU.EventSessionKey, EOU.EventSourceID, EOU.EventSeq, 
EOU.EventColumnName, EOU.ProSymbol,
EOU.ProKey, tP.isActive, tP.Description, 
'POSITIVE' AS ErrType FROM EOU
LEFT JOIN dbRunoff.dbo.tPro tP 
ON tP.Symbol COLLATE DATABASE_DEFAULT 
    = EOU.ProSymbol COLLATE DATABASE_DEFAULT
WHERE tP.IsActive = 1 AND (EOU.ProKey IS NULL OR  EOU.ProKey <= 0)

UNION

SELECT EOU.EventSessionKey, EOU.EventSourceID, EOU.EventSeq, 
EOU.EventColumnName, EOU.ProSymbol,
EOU.ProKey, tP.isActive, tP.Description, 
'DUPLICATE' AS ErrType FROM EOU
LEFT JOIN dbRunoff.dbo.tPro tP 
ON tP.Symbol COLLATE DATABASE_DEFAULT 
    = EOU.ProSymbol COLLATE DATABASE_DEFAULT
JOIN ProFilter PF ON PF.ProKey = tp.ProKey
WHERE tP.IsActive = 1 AND (EOU.ProKey IS NULL OR  EOU.ProKey <= 0)) as q