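SQL: how do I remove incorrect duplicate values?

Time: 2015-05-27 17:42:13

Tags: mysql sql oracle

Basically, my problem is that I need to remove incorrect duplicate values (see below). I can't use DISTINCT, because it would also remove some correct values. I'd appreciate any suggestions. Let me know if you need more clarification :)

I have two tables, Table1 and Table2: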

----------------------------------------------------------
CUSTOMER | Ammount | Invoice number | Time Stamp
----------------------------------------------------------
 A       | 57000,2 | 631            | Time Stamp
 A       | 56000   | 631            | Time Stamp
 A       | 55000,1 | 632            | Time Stamp
 A       | 54000   | 632            | Time Stamp

--------------------------------------------------------------------------
CUSTOMER |        FREE TEXT           |Invoice number| Time Stamp
--------------------------------------------------------------------------
 A       | 57.000,2 invoice number 631 | 631          | Time Stamp
 A       | 56.000   invoice number 631 | 631          | Time Stamp
 A       | 55.000,1 invoice number 632 | 632          | Time Stamp
 A       | 54.000   invoice number 632 | 632          | Time Stamp

I use this query:

SELECT A.CUSTOMER, A.AMMOUNT, B.FREE_TEXT, B.Invoice_number
FROM Table1 A,
Table2 B
WHERE A.CUSTOMER = B.CUSTOMER
AND A.Invoice_number = B.Invoice_number
AND B.Invoice_number IN ('631','632')
AND A.CUSTOMER = 'A'
AND B.Time_stamp >= TIMESTAMP('2015-01-01 00:00:00')
AND A.Time_stamp >= TIMESTAMP('2015-01-01 00:00:00')

The result contains duplicates, some of which are incorrect. It looks like this:

 A       | 57000,2 | 57.000,2 invoice number 631  | 631 
 A       | 56000   | 57.000,2 invoice number 631  | 631
 A       | 57000,2 | 56.000   invoice number 631  | 631 
 A       | 56000   | 56.000   invoice number 631  | 631 
 A       | 55000,1 | 55.000,1 invoice number 632  | 632 
 A       | 54000   | 54.000   invoice number 632  | 632 
 A       | 55000,1 | 55.000,1 invoice number 632  | 632 
 A       | 54000   | 54.000   invoice number 632  | 632 

I would like it to look like this:

 A       | 57000,2 | 57.000,2 invoice number 631  | 631  |
 A       | 56000   | 56.000   invoice number 631  | 631  |
 A       | 55000,1 | 55.000,1 invoice number 632  | 632  |
 A       | 54000   | 54.000   invoice number 632  | 632  |

3 Answers:

Answer 0: (score: 0)

Use a GROUP BY clause.

SELECT A.CUSTOMER, A.AMMOUNT, B.FREE_TEXT, B.Invoice_number
FROM Table1 A,
Table2 B
WHERE A.CUSTOMER = B.CUSTOMER
AND A.Invoice_number = B.Invoice_number
AND B.Invoice_number IN ('631','632')
AND A.CUSTOMER = 'A'
AND B.Time_stamp >= TIMESTAMP('2015-01-01 00:00:00')
AND A.Time_stamp >= TIMESTAMP('2015-01-01 00:00:00')
GROUP BY A.AMMOUNT 

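Answer 1: (score: 0)

In this case I would want to understand how these tables relate. Based on your comments, the relationship depends on customer, invoice_number, and amount, but in one of the tables the amount is embedded in a free-form text field.

If we assume the format of this free_text field is consistent, and that we are on MySQL rather than Oracle:

SELECT A.CUSTOMER, A.AMMOUNT, B.FREE_TEXT, B.Invoice_number
FROM Table1 A
INNER JOIN Table2 B
  ON A.CUSTOMER = B.CUSTOMER
 AND A.Invoice_number = B.Invoice_number
 -- only matches if FREE_TEXT is exactly '<amount> invoice number <invoice number>'
 AND CONCAT_WS(' ', A.AMMOUNT, 'invoice number', A.Invoice_number) = B.FREE_TEXT
WHERE B.Invoice_number IN ('631','632')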
AND A.CUSTOMER = 'A'
AND B.Time_stamp >= TIMESTAMP('2015-01-01 00:00:00')
AND A.Time_stamp >= TIMESTAMP('2015-01-01 00:00:00')

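Performance may be slow because of the string concatenation, though, and an index on the amount column cannot be used for this comparison.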
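As a minimal sketch of the idea above, the following rebuilds the sample data in an in-memory SQLite database (standing in for MySQL or Oracle purely so the example can be run) and adds the amount to the join condition. It rests on assumptions the question does not confirm: amounts are stored as text exactly as shown, FREE_TEXT always starts with the amount followed by a space, and the only formatting difference is the `.` thousands separator. The column names `amount` and `free_text` and the time stamp values are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (customer TEXT, amount TEXT, invoice_number TEXT, time_stamp TEXT);
CREATE TABLE Table2 (customer TEXT, free_text TEXT, invoice_number TEXT, time_stamp TEXT);
-- Sample rows from the question; the time stamps are made up for illustration.
INSERT INTO Table1 VALUES
  ('A', '57000,2', '631', '2015-02-01 00:00:00'),
  ('A', '56000',   '631', '2015-02-01 00:00:00'),
  ('A', '55000,1', '632', '2015-02-01 00:00:00'),
  ('A', '54000',   '632', '2015-02-01 00:00:00');
INSERT INTO Table2 VALUES
  ('A', '57.000,2 invoice number 631', '631', '2015-02-01 00:00:00'),
  ('A', '56.000   invoice number 631', '631', '2015-02-01 00:00:00'),
  ('A', '55.000,1 invoice number 632', '632', '2015-02-01 00:00:00'),
  ('A', '54.000   invoice number 632', '632', '2015-02-01 00:00:00');
""")

# Join on customer and invoice number only: every Table1 row pairs with
# every Table2 row of the same invoice (2 x 2 rows per invoice).
ORIGINAL_JOIN = """
SELECT A.customer, A.amount, B.free_text, B.invoice_number
FROM Table1 A
INNER JOIN Table2 B
   ON A.customer = B.customer
  AND A.invoice_number = B.invoice_number
"""

# Same join plus an amount condition: take the first token of free_text,
# strip the thousands separators, and require it to equal the amount.
AMOUNT_JOIN = ORIGINAL_JOIN + """
  AND REPLACE(SUBSTR(B.free_text, 1, INSTR(B.free_text, ' ') - 1), '.', '') = A.amount
WHERE B.invoice_number IN ('631', '632')
  AND A.customer = 'A'
  AND A.time_stamp >= '2015-01-01 00:00:00'
  AND B.time_stamp >= '2015-01-01 00:00:00'
ORDER BY A.amount DESC
"""

all_rows = conn.execute(ORIGINAL_JOIN).fetchall()
rows = conn.execute(AMOUNT_JOIN).fetchall()

print(len(all_rows), "rows without the amount condition")
for row in rows:
    print(row)
```

On MySQL the same condition could probably be written with SUBSTRING_INDEX(B.FREE_TEXT, ' ', 1) in place of the SUBSTR/INSTR pair, but that variant is not tested here.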

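Answer 2: (score: -1)

DISTINCT will not solve this problem.

It looks like you need to restrict the rows by comparing Ammount against FREE_TEXT.

The concatenation below may need to be adjusted for the database engine you are using (the first form uses the + operator, as in SQL Server; the second uses CONCAT, as in MySQL).

AND B.FREE_TEXT LIKE '%' + A.Ammount + '%'

AND B.FREE_TEXT LIKE CONCAT('%', A.Ammount,'%')

Warning: this may perform poorly if you are working with a large number of rows.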
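A rough sketch of how the LIKE predicate behaves on the sample values, evaluated through an in-memory SQLite connection only so that it can be run (SQLite concatenates with `||`). The helper `amount_in_text` is hypothetical, and the sketch assumes that amounts are stored as text and that the dots in FREE_TEXT are thousands separators that can be stripped before comparing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def amount_in_text(amount, free_text, strip_dots=True):
    """Hypothetical helper: evaluate the LIKE predicate for one pair of values."""
    # Optionally remove the '.' thousands separators from the free text first.
    haystack = "REPLACE(?, '.', '')" if strip_dots else "?"
    sql = f"SELECT {haystack} LIKE '%' || ? || '%'"
    return bool(conn.execute(sql, (free_text, amount)).fetchone()[0])

# The raw text contains '57.000,2', not '57000,2', so a plain LIKE finds nothing.
plain = amount_in_text("57000,2", "57.000,2 invoice number 631", strip_dots=False)

# After stripping the dots, the matching pair is found and the wrong pair is not.
match = amount_in_text("57000,2", "57.000,2 invoice number 631")
mismatch = amount_in_text("56000", "57.000,2 invoice number 631")

# A substring match can over-match: '6000' is contained in '56000'.
over_match = amount_in_text("6000", "56.000   invoice number 631")

print(plain, match, mismatch, over_match)  # False True False True
```

Because of the over-match in the last case, anchoring the pattern to the start of the text (no leading `%`, and a space after the amount) is likely safer if FREE_TEXT always begins with the amount.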