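In a table, I want to find the rows in which at least 2 fields (columns) share the same non-null value. A generic SQL solution is preferred, since it could be used on any database. Failing that, Oracle and SQL Server are my target databases. As an example:
ID  COL1  COL2  COL3  COL4
1   11    11    11    44
2   11    22    33    44
3   11    null  33    33
4   11    null  null  44
The following rows should be returned:
ID  COL1  COL2  COL3  COL4
1   11    11    11    44
3   11    null  33    33
The first row has 3 fields with the duplicate value 11, and the other row has the duplicate value 33 in col3 and col4.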
Answer 0 (score: 2)
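The brute-force approach is:
select t.*
from t
where col1 = col2 or col1 = col3 or col1 = col4 or
      col2 = col3 or col2 = col4 or col3 = col4
A comparison with NULL is never true, so this matches only non-null duplicates, which is what the question asks for (row 4 is not returned). If NULLs should also count as duplicates of each other, extend the where clause with:
   or (col1 is null and (col2 is null or col3 is null or col4 is null) or
       col2 is null and (col3 is null or col4 is null) or
       col3 is null and col4 is null
      )
This works in any database.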
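To check the pairwise-equality predicate against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (an assumption; the question targets Oracle and SQL Server), and the helper name `duplicate_ids` is made up for this illustration.

```python
import sqlite3

# Sample rows from the question; None stands for SQL NULL.
ROWS = [
    (1, 11, 11, 11, 44),
    (2, 11, 22, 33, 44),
    (3, 11, None, 33, 33),
    (4, 11, None, None, 44),
]

# Pairwise equality only: a comparison with NULL is never true,
# so NULLs cannot produce a match.
QUERY = """
select t.id
from t
where col1 = col2 or col1 = col3 or col1 = col4
   or col2 = col3 or col2 = col4 or col3 = col4
order by t.id
"""


def duplicate_ids(rows):
    """Return the ids of rows where at least two columns share a non-null value."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (id integer, col1, col2, col3, col4)")
        conn.executemany("insert into t values (?, ?, ?, ?, ?)", rows)
        return [r[0] for r in conn.execute(QUERY)]
    finally:
        conn.close()


print(duplicate_ids(ROWS))  # prints [1, 3]
```

Row 4 is left out because neither `11 = NULL` nor `NULL = NULL` evaluates to true.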
Answer 1 (score: 0)
You can use UNPIVOT:
Oracle 11g R2 schema setup:
CREATE TABLE table_name ( ID, COL1, COL2, COL3, COL4 ) As
SELECT 1, 11, 11, 11, 44 FROM DUAL UNION ALL
SELECT 2, 11, 22, 33, 44 FROM DUAL UNION ALL
SELECT 3, 11, null, 33, 33 FROM DUAL UNION ALL
SELECT 4, 11, null, null, 44 FROM DUAL;
Query 1:
SELECT *
FROM   table_name
WHERE  id IN (
  SELECT id
  FROM   table_name
  UNPIVOT ( value FOR key IN ( COL1, COL2, COL3, COL4 ) )
  GROUP BY id, value
  HAVING COUNT( DISTINCT key ) > 1
)
Results:
| ID | COL1 | COL2 | COL3 | COL4 |
|----|------|--------|------|------|
| 1 | 11 | 11 | 11 | 44 |
| 3 | 11 | (null) | 33 | 33 |
If you also want to match on NULL, use UNPIVOT INCLUDE NULLS.
For SQL Server, the code is almost identical (it only needs an alias on the UNPIVOT):
Query 1:
SELECT *
FROM   table_name
WHERE  id IN (
  SELECT id
  FROM   table_name
  UNPIVOT ( value FOR name IN ( COL1, COL2, COL3, COL4 ) ) AS u
  GROUP BY id, value
  HAVING COUNT( DISTINCT name ) > 1
)
Results:
| ID | COL1 | COL2 | COL3 | COL4 |
|----|------|--------|------|------|
| 1 | 11 | 11 | 11 | 44 |
| 3 | 11 | (null) | 33 | 33 |
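UNPIVOT is not available in every database. As a rough sketch of a more portable variant, the unpivot step can be emulated with UNION ALL and then grouped the same way. The example below runs that idea in an in-memory SQLite database via Python's sqlite3 module. SQLite, the `col_name`/`col_value` aliases, and the helper name are assumptions made for this illustration, not part of the answer above.

```python
import sqlite3

# Sample rows from the question; None stands for SQL NULL.
ROWS = [
    (1, 11, 11, 11, 44),
    (2, 11, 22, 33, 44),
    (3, 11, None, 33, 33),
    (4, 11, None, None, 44),
]

# Emulate UNPIVOT: one output row per (id, column name, column value).
# NULL values are filtered out, as a plain UNPIVOT would do.
QUERY = """
select distinct id
from (
    select id, 'COL1' as col_name, col1 as col_value from table_name
    union all
    select id, 'COL2', col2 from table_name
    union all
    select id, 'COL3', col3 from table_name
    union all
    select id, 'COL4', col4 from table_name
) u
where col_value is not null
group by id, col_value
having count(distinct col_name) > 1
order by id
"""


def unpivot_duplicate_ids(rows):
    """Return ids of rows where some non-null value occurs in 2+ columns."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table table_name (id integer, col1, col2, col3, col4)"
        )
        conn.executemany("insert into table_name values (?, ?, ?, ?, ?)", rows)
        return [r[0] for r in conn.execute(QUERY)]
    finally:
        conn.close()


print(unpivot_duplicate_ids(ROWS))  # prints [1, 3]
```

The `distinct` keeps an id from appearing twice when a single row contains two different repeated values.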
Update:
You can also generate the brute-force query from the *_TAB_COLUMNS data dictionary views in Oracle (an equivalent may exist in SQL Server):
SELECT 'SELECT * FROM TABLE_NAME WHERE ('
       || LISTAGG(
            '"' || PRIOR COLUMN_NAME || '" = "' || COLUMN_NAME || '"',
            ' OR '
          ) WITHIN GROUP ( ORDER BY ROWNUM )
       || ')' AS query
FROM   USER_TAB_COLUMNS
WHERE  TABLE_NAME = 'TABLE_NAME'
AND    COLUMN_NAME LIKE 'COL%'
AND    LEVEL = 2
START WITH COLUMN_NAME LIKE 'COL%'
CONNECT BY PRIOR COLUMN_ID < COLUMN_ID;
Which outputs:
SELECT * FROM TABLE_NAME WHERE ("COL1" = "COL2" OR "COL1" = "COL3" OR "COL1" = "COL4" OR "COL2" = "COL3" OR "COL2" = "COL4" OR "COL3" = "COL4")
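The same query text can also be built outside the database. As a small sketch, assuming the column names are already known and trusted (they are interpolated without escaping), every unordered pair of columns can be enumerated with `itertools.combinations`. The function name `pairwise_query` is hypothetical.

```python
from itertools import combinations


def pairwise_query(table, columns):
    """Build the brute-force query comparing every unordered pair of columns.

    Assumes `table` and `columns` are trusted identifiers; nothing is escaped.
    """
    if len(columns) < 2:
        raise ValueError("need at least two columns to compare")
    predicate = " OR ".join(
        f'"{a}" = "{b}"' for a, b in combinations(columns, 2)
    )
    return f"SELECT * FROM {table} WHERE ({predicate})"


print(pairwise_query("TABLE_NAME", ["COL1", "COL2", "COL3", "COL4"]))
```

For the four sample columns this produces the same six comparisons as the generated query shown above. The number of comparisons grows as n(n-1)/2 with the number of columns.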
Answer 2 (score: 0)
A solution that does not hard-code the column names (SQL Server). Say our table is [#test]. Then the query is:
;with [temp] as (
    select
        [id] = id
        ,[col_name1] = [c1].[value]('local-name(.)', 'nvarchar(256)')
        ,[col_value1] = [c1].[value]('.', 'nvarchar(256)')
        ,[col_name2] = [c2].[value]('local-name(.)', 'nvarchar(256)')
        ,[col_value2] = [c2].[value]('.', 'nvarchar(256)')
    from
        [#test] as [t]
    cross apply
    (
        select [data] = convert(xml, (select [t].* for xml path('row') ))
    ) as [x]
    cross apply
        [x].[data].[nodes]('row/*') as [t1]([c1])
    cross apply
        [x].[data].[nodes]('row/*') as [t2]([c2])
)
,[ids] as (
    select
        [id]
    from
        [temp]
    where
        ([col_name1] <> [col_name2] )
        and ([col_value1] = [col_value2])
    group by
        [id]
)
select
    *
from
    [#test] as [t]
inner join
    [ids] as [i]
on
    [t].[id] = [i].[id];
The full query can be found at: https://pastebin.com/jUG5r41c
Answer 3 (score: 0)
For SQL Server, I would use the APPLY operator for this:
select t.*, c.TotalCols, c.DistinctCols
from table_name t
cross apply (
    -- COUNT(ids) skips NULLs, so only non-null values are compared
    select COUNT(ids) as TotalCols, COUNT(distinct ids) as DistinctCols
    from (values (Col1), (Col2),.. (ColN)) v(ids)
) c
where c.TotalCols <> c.DistinctCols;
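The counting idea does not depend on SQL Server: a row qualifies when it has more non-null values than distinct non-null values. A minimal Python sketch of that rule follows (the function name is made up for illustration). The total must count only non-null values. If NULLs were included in the total, row 4 of the sample would be flagged even though it has no repeated non-null value.

```python
def has_repeated_value(values):
    """True if some non-null value occurs in at least two positions."""
    non_null = [v for v in values if v is not None]
    # Fewer distinct values than values means at least one is repeated.
    return len(non_null) != len(set(non_null))


# Sample rows from the question as (id, columns); None stands for NULL.
rows = [
    (1, [11, 11, 11, 44]),
    (2, [11, 22, 33, 44]),
    (3, [11, None, 33, 33]),
    (4, [11, None, None, 44]),
]

matching_ids = [row_id for row_id, cols in rows if has_repeated_value(cols)]
print(matching_ids)  # prints [1, 3]
```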