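Oracle 11g. I'm working on a query that joins several tables, and the result contains some duplicate values that I can't seem to eliminate. The main table is TABLE1, which is where I pull all of the data from. I have to join that table to itself to pull the TYPE 1 and TYPE 2 data and flatten it. Then I do a simple join to TABLE2 on LOC_ID to get some more data, and a third join to TABLE3 for still more data. In that third table, though, I can eliminate the duplicates by looking at the column named INVT_PRI. The problem is that I only want to look at that column when I actually have a duplicate. I don't want to base my decision on that column otherwise, because it has a different meaning. Fewer than 10 records out of several thousand are affected. Here is some table data: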
TABLE1
LOC_ID  INVT_CD  CNT_CD  TYPE
111     AA       US      1
111     BB       US      1
111     11133    US      2
112     CC       US      1
112     11233    US      2
113     FF       US      1
113     55555    US      2
114     LL       EU      1
114     MM       EU      1
114     22222    EU      2

TABLE2
LOC_ID  LOC_NAME  LOC_ADDR
111     PRIMUSA   123 Main
112     SECUSA    117 Northern
113     ELVUSA    222 Southern
114     DISTAEU   134 Elaveen

TABLE3
LOC_ID  INVT_CD  PHS_IND  ILW_IND  INVT_PRI
111     AA       S        S        BB
111     BB       S        S        BB
112     CC       S        S        CC
113     FF       S        S        Z
114     LL       S        S        LL
114     MM       S        S        LL
Here is the SQL I have so far. It joins in all the data I need, except that it doesn't handle these duplicates:
SELECT distinct
a.LOC_ID,
a.INVT_CD PSEUDO,
a.CNT_CD,
c.LOC_NAME,
c.LOC_ADDR,
b.INVT_CD DOMAIN,
d.PHS_IND,
d.ILW_IND,
d.INVT_PRI
FROM TABLE1 a
LEFT OUTER JOIN TABLE1 b ON a.LOC_ID = b.LOC_ID AND b.TYPE = 2
LEFT OUTER JOIN TABLE2 c ON c.LOC_ID = a.LOC_ID
LEFT OUTER JOIN TABLE3 d ON d.LOC_ID = a.LOC_ID
WHERE a.TYPE = 1;
The results look like this:
LOC_ID  PSEUDO  CNT_CD  LOC_NAME  LOC_ADDR      DOMAIN  PHS_IND  ILW_IND  INVT_PRI
111     AA      US      PRIMUSA   123 Main      11133   S        S        BB
111     BB      US      PRIMUSA   123 Main      11133   S        S        BB
112     CC      US      SECUSA    117 Northern  11233   S        S        CC
113     FF      US      ELVUSA    222 Southern  55555   S        S        Z
114     LL      EU      DISTAEU   134 Elaveen   22222   S        S        LL
114     MM      EU      DISTAEU   134 Elaveen   22222   S        S        LL
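However, I want to remove the duplicate LOC_IDs from the results, keeping the row where PSEUDO = INVT_PRI and discarding the other one. As I mentioned, I can't always use INVT_PRI for this purpose ... only when I detect a duplicate.
If you know of a simple solution that doesn't involve writing a procedure, I'd appreciate your time and help.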
Answer 0 (score: 2)
This is a prioritization problem, and you can solve it with row_number(). For convenience, I have put your query in a CTE:
with t as (
SELECT a.LOC_ID, a.INVT_CD PSEUDO, a.CNT_CD, c.LOC_NAME, c.LOC_ADDR,
b.INVT_CD DOMAIN, d.PHS_IND, d.ILW_IND, d.INVT_PRI
FROM TABLE1 a LEFT OUTER JOIN
TABLE1 b
ON a.LOC_ID = b.LOC_ID AND b.TYPE = 2 LEFT OUTER JOIN
TABLE2 c
ON c.LOC_ID = a.LOC_ID LEFT OUTER JOIN
TABLE3 d
ON d.LOC_ID = a.LOC_ID
WHERE a.TYPE = 1
)
select t.*
from (select t.*,
row_number() over (partition by loc_id
order by (case when PSEUDO = INVT_PRI then 1 else 2 end)
) as seqnum
from t
) t
where seqnum = 1;
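This keeps one row for each loc_id. By preference, it will be the row where the two values match. If no row matches, it picks one of the remaining rows arbitrarily.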
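To check the row_number() approach against the sample data without an Oracle instance, here is a rough sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here (window functions need SQLite 3.25 or later), the TABLE2 join is left out because it does not affect the duplicates, and the extra tie-breaker on PSEUDO is my own addition to make the pick deterministic.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TABLE1 (LOC_ID INT, INVT_CD TEXT, CNT_CD TEXT, TYPE INT);
INSERT INTO TABLE1 VALUES
  (111,'AA','US',1),(111,'BB','US',1),(111,'11133','US',2),
  (112,'CC','US',1),(112,'11233','US',2),
  (113,'FF','US',1),(113,'55555','US',2),
  (114,'LL','EU',1),(114,'MM','EU',1),(114,'22222','EU',2);
CREATE TABLE TABLE3 (LOC_ID INT, INVT_CD TEXT, PHS_IND TEXT,
                     ILW_IND TEXT, INVT_PRI TEXT);
INSERT INTO TABLE3 VALUES
  (111,'AA','S','S','BB'),(111,'BB','S','S','BB'),
  (112,'CC','S','S','CC'),(113,'FF','S','S','Z'),
  (114,'LL','S','S','LL'),(114,'MM','S','S','LL');
""")

QUERY = """
WITH t AS (
  SELECT a.LOC_ID, a.INVT_CD AS PSEUDO, b.INVT_CD AS DOMAIN, d.INVT_PRI
  FROM TABLE1 a
  LEFT JOIN TABLE1 b ON a.LOC_ID = b.LOC_ID AND b.TYPE = 2
  LEFT JOIN TABLE3 d ON d.LOC_ID = a.LOC_ID
  WHERE a.TYPE = 1
)
SELECT LOC_ID, PSEUDO, DOMAIN, INVT_PRI
FROM (SELECT t.*,
             ROW_NUMBER() OVER (
               PARTITION BY LOC_ID
               -- rows where PSEUDO = INVT_PRI sort first;
               -- PSEUDO is an added tie-breaker (not in the answer)
               ORDER BY CASE WHEN PSEUDO = INVT_PRI THEN 1 ELSE 2 END,
                        PSEUDO
             ) AS seqnum
      FROM t) AS ranked
WHERE seqnum = 1
ORDER BY LOC_ID
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

On the sample data this leaves one row per LOC_ID: BB for 111 and LL for 114, while 113 keeps FF even though its INVT_PRI is Z, because it has no duplicate to compete with.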
Answer 1 (score: 1)
Fiddle test: http://sqlfiddle.com/#!4/fc31a5/5/0
with sub as
(SELECT distinct a.LOC_ID,
a.INVT_CD PSEUDO,
a.CNT_CD,
c.LOC_NAME,
c.LOC_ADDR,
b.INVT_CD DOMAIN,
d.PHS_IND,
d.ILW_IND,
d.INVT_PRI
FROM TABLE1 a
LEFT OUTER JOIN TABLE1 b
ON a.LOC_ID = b.LOC_ID
AND b.TYPE = 2
LEFT OUTER JOIN TABLE2 c
ON c.LOC_ID = a.LOC_ID
LEFT OUTER JOIN TABLE3 d
ON d.LOC_ID = a.LOC_ID
WHERE a.TYPE = 1),
dup as
(select loc_id from sub group by loc_id having count(*) > 1)
select sub.*
from sub
left join dup
on sub.loc_id = dup.loc_id
where (sub.pseudo = sub.invt_pri and dup.loc_id is not null)
or dup.loc_id is null
order by sub.loc_id
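One behavior of this query worth checking: when a duplicated LOC_ID has no row with PSEUDO = INVT_PRI, the filter removes every row for that LOC_ID. The sketch below runs a trimmed version of the query in SQLite through Python's sqlite3 module. SQLite is an assumption standing in for Oracle, the TABLE2 join and the TABLE1 self-join are omitted for brevity, and LOC_ID 115 is a hypothetical case that is not in the original sample data.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TABLE1 (LOC_ID INT, INVT_CD TEXT, CNT_CD TEXT, TYPE INT);
INSERT INTO TABLE1 VALUES
  (111,'AA','US',1),(111,'BB','US',1),(111,'11133','US',2),
  (112,'CC','US',1),(112,'11233','US',2),
  (113,'FF','US',1),(113,'55555','US',2),
  (114,'LL','EU',1),(114,'MM','EU',1),(114,'22222','EU',2),
  -- hypothetical duplicate with no PSEUDO = INVT_PRI match
  (115,'XX','US',1),(115,'YY','US',1);
CREATE TABLE TABLE3 (LOC_ID INT, INVT_CD TEXT, PHS_IND TEXT,
                     ILW_IND TEXT, INVT_PRI TEXT);
INSERT INTO TABLE3 VALUES
  (111,'AA','S','S','BB'),(111,'BB','S','S','BB'),
  (112,'CC','S','S','CC'),(113,'FF','S','S','Z'),
  (114,'LL','S','S','LL'),(114,'MM','S','S','LL'),
  -- hypothetical rows for LOC_ID 115
  (115,'XX','S','S','Q'),(115,'YY','S','S','Q');
""")

QUERY = """
WITH sub AS (
  SELECT DISTINCT a.LOC_ID, a.INVT_CD AS PSEUDO, d.INVT_PRI
  FROM TABLE1 a
  LEFT JOIN TABLE3 d ON d.LOC_ID = a.LOC_ID
  WHERE a.TYPE = 1
),
dup AS (
  -- LOC_IDs that appear more than once after DISTINCT
  SELECT LOC_ID FROM sub GROUP BY LOC_ID HAVING COUNT(*) > 1
)
SELECT sub.LOC_ID, sub.PSEUDO, sub.INVT_PRI
FROM sub
LEFT JOIN dup ON sub.LOC_ID = dup.LOC_ID
WHERE (sub.PSEUDO = sub.INVT_PRI AND dup.LOC_ID IS NOT NULL)
   OR dup.LOC_ID IS NULL
ORDER BY sub.LOC_ID
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

LOC_IDs 111 to 114 come back with one row each, but the hypothetical 115 disappears entirely. The row_number() query in Answer 0 would keep one of its rows instead, so the choice between the two depends on which outcome is wanted for that case.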