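One last question about duplicates. I know how to select duplicate records using COUNT(*) with a HAVING clause > 1, but I am struggling to remove duplicates only when a condition is met.
Yesterday I asked about part of this: removing duplicates when the bill amounts cancel out. Now I have to add a criterion: the bill amounts must have the same value with opposite signs, and both the date and the code must match.
For example, record 1 has a bill amount of $250, code 'JUN', and date 03/02/2020; record 2 has a bill amount of $250, code 'PII', and date 03/07/2020; record 3 has a bill amount of -$250, code 'PII', and date 03/07/2020. The result I want from this example is record 1 only, because records 2 and 3 count as duplicates under the condition I described.
Table creation: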
CREATE TABLE Billing (
BillId varchar(10),
SerialNo varchar(10),
BillAmt MONEY,
Code varchar(5),
DispenseDt DATE
);
Data insert:
INSERT INTO Billing (BillId, SerialNo, BillAmt, Code, DispenseDt)
VALUES ('BL_001','aaa-111',250,'AAP','20200503')
,('BL_002','aab-112',250,'ADD','20200309')
,('BL_003','aab-112',-250,'ADD','20200309')
,('BL_004','aba-120',700,'YED','20200503')
,('BL_005','aba-120',370,'TPP','20200822')
,('BL_006','aba-120',370,'TPP','20201003')
,('BL_007','aba-120',400,'TPP','20200822')
,('BL_008','aba-120',-370,'TPP','20200822')
,('BL_009','aba-120',-700,'YED','20200503')
,('BL_010','baa-201',1000,'TOK','20200927')
,('BL_011','baa-201',-1000,'TOK','20200927')
,('BL_012','bab-210',1000,'TOK','20200927');
Sample data:
+----------+-----------+---------+------+------------+
| BillId   | SerialNo  | BillAmt | Code | DispenseDt |
+----------+-----------+---------+------+------------+
| BL_001   | aaa-111   |    $250 | AAP  | 20200503   |
| BL_002   | aab-112   |    $250 | ADD  | 20200309   |
| BL_003   | aab-112   |   -$250 | ADD  | 20200309   |
| BL_004   | aba-120   |    $700 | YED  | 20200503   |
| BL_005   | aba-120   |    $370 | TPP  | 20200822   |
| BL_006   | aba-120   |    $370 | TPP  | 20201003   |
| BL_007   | aba-120   |    $400 | TPP  | 20200822   |
| BL_008   | aba-120   |   -$370 | TPP  | 20200822   |
| BL_009   | aba-120   |   -$700 | YED  | 20200503   |
| BL_010   | baa-201   |   $1000 | TOK  | 20200927   |
| BL_011   | baa-201   |  -$1000 | TOK  | 20200927   |
| BL_012   | bab-210   |   $1000 | TOK  | 20200927   |
+----------+-----------+---------+------+------------+
Expected result:
+----------+-----------+---------+------+------------+
| BillId   | SerialNo  | BillAmt | Code | DispenseDt |
+----------+-----------+---------+------+------------+
| BL_001   | aaa-111   |    $250 | AAP  | 20200503   |
| BL_006   | aba-120   |    $370 | TPP  | 20201003   |
| BL_007   | aba-120   |    $400 | TPP  | 20200822   |
| BL_012   | bab-210   |   $1000 | TOK  | 20200927   |
+----------+-----------+---------+------+------------+
My code:
select a.SerialNo, a.BillAmt, a.Code, a.DispenseDt
from (
    select *,
           count(SerialNo) over(partition by SerialNo, DispenseDt) b
    from Billing
) a
where b = 1
  -- InvoiceDt and FacID do not appear in the sample table above
  and InvoiceDt >= '20200601' and InvoiceDt <= '20200630'
  and FacID in ('IND600','IND605','IND610','IND620','IND630','IND640','IND650','IND660','IND670','IND680','IND690','IND695')
order by a.SerialNo;
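Answer 0: (score: 0)
I tried to solve this but got a bit stuck myself. The idea is to compute a rank and then filter on rows that share a rank, but the ranks my code produces [two of them, one with rank() and one with row_number()] would remove some of the rows you need in the output. Could someone else edit this code? That would be great.
Fiddle link: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=e0c990d3694ad99b628b3e05a5de624f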
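select
    BillId,
    Code,
    DispenseDt,
    new_bill_amt,
    -- no ORDER BY in the windows, so rank() is 1 for every row (accepted by MySQL 8.0)
    rank()       over(partition by new_bill_amt, DispenseDt, Code) as rank_,
    row_number() over(partition by new_bill_amt, DispenseDt, Code) as rank_2
from (
    select
        *,
        -- sign-free amount, so +250 and -250 land in the same partition
        abs(BillAmt) as new_bill_amt
    from Billing
) as f;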
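One way the rank-then-filter idea above might be completed, offered as a sketch rather than a tested solution: number the charges and the reversals separately within each SerialNo, Code, DispenseDt, and absolute amount, then drop every row that has an opposite-signed row carrying the same number. The names numbered, AbsAmt, AmtSign, and rn are made up for illustration. The query assumes BillId is unique, and it sticks to functions that exist in both SQL Server and MySQL 8.0. SerialNo is part of the partition because the expected result keeps BL_012 even though BL_010 and BL_011 share its code, date, and amount.

```sql
WITH numbered AS (
    SELECT
        BillId, SerialNo, BillAmt, Code, DispenseDt,
        ABS(BillAmt)  AS AbsAmt,   -- hypothetical helper: amount without its sign
        SIGN(BillAmt) AS AmtSign,  -- hypothetical helper: 1 for charges, -1 for reversals
        -- Number charges and reversals independently inside each group
        ROW_NUMBER() OVER (
            PARTITION BY SerialNo, Code, DispenseDt, ABS(BillAmt), SIGN(BillAmt)
            ORDER BY BillId
        ) AS rn
    FROM Billing
)
SELECT n.BillId, n.SerialNo, n.BillAmt, n.Code, n.DispenseDt
FROM numbered AS n
WHERE NOT EXISTS (
    -- The n-th charge is cancelled by the n-th reversal of the same group
    SELECT 1
    FROM numbered AS m
    WHERE m.SerialNo   = n.SerialNo
      AND m.Code       = n.Code
      AND m.DispenseDt = n.DispenseDt
      AND m.AbsAmt     = n.AbsAmt
      AND m.AmtSign    = -n.AmtSign
      AND m.rn         = n.rn
      AND m.BillId    <> n.BillId   -- keeps a zero-amount row from matching itself
)
ORDER BY n.BillId;
```

On the question's sample data this should leave BL_001, BL_006, BL_007, and BL_012. Because rows are paired one to one, a group with two charges and a single reversal of the same amount would keep exactly one of the charges.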
Answer 1: (score: 0)
I think this might work.
(I used a CTE, but you can convert it to a subquery.)
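WITH base_cte AS (
    -- Net amount per SerialNo, Code, and DispenseDt; cancelled pairs net to zero
    SELECT
        B1.SerialNo
      , SUM(B1.BillAmt) AS [TotAmt]
      , B1.Code
      , B1.DispenseDt
    FROM #Billing AS B1
    GROUP BY
        B1.SerialNo
      , B1.Code
      , B1.DispenseDt
)
SELECT
    B.BillId
  , B.SerialNo
  , B.BillAmt
  , B.code
  , B.DispenseDt
FROM #Billing AS B
INNER JOIN base_cte AS X
    ON  X.SerialNo   = B.SerialNo
    AND X.Code       = B.Code
    AND X.DispenseDt = B.DispenseDt
-- Keep only the rows whose amount equals the net of their group
WHERE X.TotAmt = B.BillAmt
ORDER BY B.BillId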
Output:
BillId  SerialNo  BillAmt  code  DispenseDt
BL_001  aaa-111    250.00  AAP   2020-05-03
BL_006  aba-120    370.00  TPP   2020-10-03
BL_007  aba-120    400.00  TPP   2020-08-22
BL_012  bab-210   1000.00  TOK   2020-09-27
Edit: Here is a different approach using OVER().
SELECT
    Y.BillId
  , Y.SerialNo
  , Y.BillAmt
  , Y.Code
  , Y.DispenseDt
FROM (
    SELECT
        B.BillId
      , B.SerialNo
      , B.BillAmt
      , B.code
      , B.DispenseDt
      -- Net amount of every bill sharing this SerialNo, Code, and DispenseDt
      , [TotAmt] = SUM(B.BillAmt) OVER(PARTITION BY B.SerialNo, B.code, B.DispenseDt)
    FROM #Billing AS B
) AS Y
-- Keep only the rows whose amount equals the net of their group
WHERE Y.BillAmt = Y.TotAmt
ORDER BY Y.BillId
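If the goal is to physically remove the cancelled rows rather than filter them out of a SELECT, the same window sum could probably drive a DELETE. This is a sketch for SQL Server only, written against the question's Billing table. The CTE name totals is made up for illustration, and the transaction is there so the remaining rows can be inspected before anything is committed.

```sql
BEGIN TRANSACTION;

WITH totals AS (
    SELECT
        BillAmt
        -- Net amount of every bill sharing this SerialNo, Code, and DispenseDt
      , SUM(BillAmt) OVER (PARTITION BY SerialNo, Code, DispenseDt) AS TotAmt
    FROM Billing
)
-- Deleting through the CTE removes the underlying Billing rows
DELETE FROM totals
WHERE BillAmt <> TotAmt;

-- Inspect what is left before deciding
SELECT BillId, SerialNo, BillAmt, Code, DispenseDt
FROM Billing
ORDER BY BillId;

ROLLBACK TRANSACTION;  -- switch to COMMIT once the remaining rows look right
```

Like the SELECT it is based on, this keeps a row only when its amount equals the net of its group, so a group holding two charges and one reversal of the same amount would keep both charges.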