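I have already found some duplicates in my data (based on V1, V2, V3, V4), but reviewing that many records by hand is getting harder and harder, so I would like to assign a rank based on several conditions:
and so on. How can I do this?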
T1
ID | V1 | V2 | V5 | V6 | CreatedDate
---| --- | --- | --- | --- | ---
1 | A | US | 1984 | QR | 01-AUG-2017
2 | B | FR | 1991 | TY | 01-JAN-2017
3 | C | AU | 1989 | GH | 25-SEP-2017
4 | B | FR | 1995 | BN | 01-AUG-2017
5 | A | US | 1984 | QR | 30-MAR-2016
6 | C | AU | 1999 | MK | 14-JUN-2015
T2
ID | V3 | V7
---| --- | ---
1 | Apple | D12
1 | Kiwi | S45
2 | Pear | T23
3 | Banana | U78
4 | Pear | T23
5 | Apple | D12
6 | Banana | P90
T3
ID | V4 | V8
---| --- | ---
1 | Spinach | A678
1 | Beets | V902
2 | Celery | T456
3 | Radish | Y675
4 | Celery | T456
5 | Spinach | G890
6 | Celery | F567
6 | Radish | R453
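Expected result
ID | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | Rank
---| --- | --- | --- | --- | --- | --- | --- | --- | ---
1 | A | US | Apple | Spinach | 1984 | QR | D12 | A678 | 1
5 | A | US | Apple | Spinach | 1984 | QR | D12 | G890 | 1
2 | B | FR | Pear | Celery | 1991 | TY | T23 | T456 | 2
4 | B | FR | Pear | Celery | 1995 | BN | T23 | T456 | 2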
Answer 0 (score: 0)
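It looks like your basic question is only about T1; T2 and T3 are just additional information.
Because your sample data is unclear, I will focus only on T1.
Here you self-join T1 and check whether the columns hold the same values.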
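SELECT TA.*, TB.*,
       CASE WHEN TA.V5 = TB.V5 AND TA.V6 = TB.V6
            THEN 'rank -1'
            -- V7 and V8 are listed under T2 and T3 in the sample data, so this
            -- branch only works if those columns are available on the joined rows
            WHEN TA.V7 = TB.V7 AND TA.V8 = TB.V8
            THEN 'rank -2'
       END AS ranking
FROM T1 TA   -- table aliases written without AS, which Oracle does not accept
JOIN T1 TB
  ON TA.ID < TB.ID
 AND (   (TA.V5 = TB.V5 AND TA.V6 = TB.V6)
      OR (TA.V7 = TB.V7 AND TA.V8 = TB.V8)
     )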
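
If the V7/V8 condition matters, one way to apply the same self-join idea is to join T1, T2 and T3 into one flat row set first and then compare the rows. The following is only a sketch under assumptions, not a tested solution for your database. It uses Python's built-in sqlite3 module purely to make the SQL runnable against the sample data from the question. The names flat, pairs, id_a and id_b are made up for illustration. The two rank rules (V5 and V6 equal gives rank 1, V7 and V8 equal gives rank 2) are inferred from the expected result, since the question does not list its conditions.

```python
import sqlite3

# In-memory database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (ID INTEGER, V1 TEXT, V2 TEXT, V5 INTEGER, V6 TEXT, CreatedDate TEXT);
CREATE TABLE T2 (ID INTEGER, V3 TEXT, V7 TEXT);
CREATE TABLE T3 (ID INTEGER, V4 TEXT, V8 TEXT);

INSERT INTO T1 VALUES
  (1, 'A', 'US', 1984, 'QR', '01-AUG-2017'),
  (2, 'B', 'FR', 1991, 'TY', '01-JAN-2017'),
  (3, 'C', 'AU', 1989, 'GH', '25-SEP-2017'),
  (4, 'B', 'FR', 1995, 'BN', '01-AUG-2017'),
  (5, 'A', 'US', 1984, 'QR', '30-MAR-2016'),
  (6, 'C', 'AU', 1999, 'MK', '14-JUN-2015');

INSERT INTO T2 VALUES
  (1, 'Apple', 'D12'), (1, 'Kiwi', 'S45'), (2, 'Pear', 'T23'),
  (3, 'Banana', 'U78'), (4, 'Pear', 'T23'), (5, 'Apple', 'D12'),
  (6, 'Banana', 'P90');

INSERT INTO T3 VALUES
  (1, 'Spinach', 'A678'), (1, 'Beets', 'V902'), (2, 'Celery', 'T456'),
  (3, 'Radish', 'Y675'), (4, 'Celery', 'T456'), (5, 'Spinach', 'G890'),
  (6, 'Celery', 'F567'), (6, 'Radish', 'R453');
""")

query = """
WITH flat AS (
    -- One row per (T1, T2, T3) combination that shares the same ID.
    SELECT T1.ID AS ID, V1, V2, V3, V4, V5, V6, V7, V8
    FROM T1
    JOIN T2 ON T2.ID = T1.ID
    JOIN T3 ON T3.ID = T1.ID
),
pairs AS (
    -- Self-join: rows that are duplicates on V1..V4, each pair reported once.
    SELECT a.ID AS id_a, b.ID AS id_b,
           CASE WHEN a.V5 = b.V5 AND a.V6 = b.V6 THEN 1   -- assumed rule for rank 1
                WHEN a.V7 = b.V7 AND a.V8 = b.V8 THEN 2   -- assumed rule for rank 2
           END AS ranking
    FROM flat a
    JOIN flat b
      ON a.ID < b.ID
     AND a.V1 = b.V1 AND a.V2 = b.V2
     AND a.V3 = b.V3 AND a.V4 = b.V4
    WHERE (a.V5 = b.V5 AND a.V6 = b.V6)
       OR (a.V7 = b.V7 AND a.V8 = b.V8)
)
SELECT DISTINCT id_a, id_b, ranking
FROM pairs
ORDER BY id_a, id_b
"""

rows = conn.execute(query).fetchall()
for id_a, id_b, ranking in rows:
    print(id_a, id_b, ranking)
```

This reports each duplicate pair once with its rank rather than one line per ID as in the expected result. IDs 3 and 6 also agree on V1 to V4 (C, AU, Banana, Radish), but they satisfy neither rule, so the WHERE clause drops them.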