Insert unique data into a table

Time: 2018-06-14 14:03:16

Tags: sql sql-server

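I'm currently upgrading a database using SQL Server. Right now I'm trying to clean up a table to get rid of a whole bunch of duplicate records. However, I can't seem to get my query to work.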

CREATE TABLE Temp_A
(
    Order_ID INT NOT NULL,
    Job_Number VARCHAR(20) NOT NULL,
    Supplier_Name VARCHAR(50) NOT NULL 
);

BULK INSERT Temp_A
FROM 'This\is\the\file\path.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n')

CREATE TABLE Temp_B
(
    Order_ID INT NOT NULL,
    Job_Number VARCHAR(20) NOT NULL,
    Supplier_Name VARCHAR(50) NOT NULL,

    CONSTRAINT Temp_Con UNIQUE (Order_ID, Job_Number)
);

INSERT INTO Temp_B
    SELECT Order_ID, Job_Number, Supplier_Name
    FROM Temp_A AS A
    WHERE NOT EXISTS (SELECT 1 
                      FROM Temp_B AS B
                      WHERE B.Order_ID = A.Order_ID
                        AND B.Job_Number = A.Job_Number)

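The part of my code that doesn't work is the final INSERT INTO Temp_B block. What I'm doing is loading the data from the CSV file into the Temp_A table, then trying to grab all the unique Order_ID & Part_Number pairs and store them in the Temp_B table.

I'd love to go in and delete the duplicates by hand, but there are thousands of records, so... yeah, that would take forever. I'm not sure where to go from here.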

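Edit: adding the error message I'm getting:

Violation of UNIQUE KEY constraint 'Temp_Con'. Cannot insert duplicate key in object 'dbo.Temp_B'. The duplicate key value is (3, L154).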

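4 answers:

Answer 0: (score: 2)

Your unique constraint covers two columns, but your source data has three. If several rows share the same Order_ID and Job_Number, which row's Supplier_Name would you pick?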
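To see which pairs are affected before deciding, a query along these lines may help. This is only a sketch: #Demo_A and its three rows are made-up stand-ins for your CSV import (the supplier names are invented), so point the SELECT at Temp_A to run it on the real data.

```sql
-- Hypothetical sample data standing in for the imported CSV.
CREATE TABLE #Demo_A
(
    Order_ID INT NOT NULL,
    Job_Number VARCHAR(20) NOT NULL,
    Supplier_Name VARCHAR(50) NOT NULL
);

INSERT INTO #Demo_A (Order_ID, Job_Number, Supplier_Name)
VALUES (3, 'L154', 'Acme'),
       (3, 'L154', 'Globex'),  -- same key as the row above, different supplier
       (4, 'L200', 'Acme');

-- List every (Order_ID, Job_Number) pair that occurs more than once,
-- with the number of rows and the number of distinct suppliers behind it.
SELECT Order_ID,
       Job_Number,
       COUNT(*) AS Rows_For_Key,
       COUNT(DISTINCT Supplier_Name) AS Suppliers_For_Key
FROM #Demo_A
GROUP BY Order_ID, Job_Number
HAVING COUNT(*) > 1;

DROP TABLE #Demo_A;
```

With the sample rows above, only the pair (3, L154) comes back. A Suppliers_For_Key value above 1 means the duplicates are not identical rows, so you have to choose which supplier survives.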

Use GROUP BY together with MAX():

INSERT INTO Temp_B (
    Order_ID, 
    Job_Number, 
    Supplier_Name)
SELECT 
    Order_ID, 
    Job_Number, 
    Supplier_Name = MAX(Supplier_Name)
FROM 
    Temp_A AS A
WHERE 
    NOT EXISTS (
        SELECT 
            'not yet in Temp_B' 
        FROM 
            Temp_B AS B
        WHERE 
            B.Order_ID = A.Order_ID AND 
            B.Job_Number = A.Job_Number)
GROUP BY
    A.Order_ID,
    A.Job_Number

Or use ROW_NUMBER():

;WITH MissingRanked AS
(
    SELECT 
        Order_ID, 
        Job_Number, 
        Supplier_Name,
        Ranking = ROW_NUMBER() OVER (
            PARTITION BY 
                A.Order_ID, 
                Job_Number 
            ORDER BY 
                (SELECT NULL)) -- Your ordering criteria here
    FROM 
        Temp_A AS A
    WHERE 
        NOT EXISTS (
            SELECT 
                'not yet in Temp_B' 
            FROM 
                Temp_B AS B
            WHERE 
                B.Order_ID = A.Order_ID AND 
                B.Job_Number = A.Job_Number)
)
INSERT INTO Temp_B (
    Order_ID, 
    Job_Number, 
    Supplier_Name)
SELECT
    Order_ID, 
    Job_Number, 
    Supplier_Name
FROM
    MissingRanked AS M
WHERE
    M.Ranking = 1
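If you would rather clean up Temp_A in place than copy into a second table, the same ROW_NUMBER() ranking can drive a DELETE through the CTE. A minimal sketch, assuming a made-up #Demo_A table with invented rows, and assuming that ordering by Supplier_Name is an acceptable way to decide which duplicate is kept (substitute your own criteria):

```sql
-- Hypothetical sample data; replace #Demo_A with the real table once verified.
CREATE TABLE #Demo_A
(
    Order_ID INT NOT NULL,
    Job_Number VARCHAR(20) NOT NULL,
    Supplier_Name VARCHAR(50) NOT NULL
);

INSERT INTO #Demo_A (Order_ID, Job_Number, Supplier_Name)
VALUES (3, 'L154', 'Acme'),
       (3, 'L154', 'Globex'),
       (4, 'L200', 'Acme');

-- Rank the rows inside each (Order_ID, Job_Number) group, then delete
-- everything except the first-ranked row of each group.
;WITH Ranked AS
(
    SELECT
        Order_ID,
        Job_Number,
        Supplier_Name,
        Ranking = ROW_NUMBER() OVER (
            PARTITION BY Order_ID, Job_Number
            ORDER BY Supplier_Name)  -- assumed tie-break, not from the question
    FROM #Demo_A
)
DELETE FROM Ranked
WHERE Ranking > 1;

-- Remaining rows: one per (Order_ID, Job_Number) pair.
SELECT Order_ID, Job_Number, Supplier_Name
FROM #Demo_A;

DROP TABLE #Demo_A;
```

Deleting through a CTE works here because the CTE reads from a single table. It is worth running the CTE with a plain SELECT first to check which rows would be removed.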

Answer 1: (score: 0)

I would try using GROUP BY to make the rows in my INSERT INTO unique, like this:

INSERT INTO Temp_B
SELECT Order_ID, Job_Number, Supplier_Name
FROM Temp_A AS A
GROUP BY A.Order_ID, A.Job_Number, A.Supplier_Name

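I don't have data to test with, but I think this will work. Your question mentions Order_ID & Part_Number, but the join as written doesn't use Part_Number; I'm guessing that's a typo, but you get the idea. This is the direction I would go. You could also use DISTINCT, but I prefer GROUP BY.

Answer 2: (score: 0)

Your approach doesn't work because the subselect sees Temp_B as it was before the insert started; in other words, it sees an empty table, so NOT EXISTS never filters out duplicates that occur within Temp_A itself.

What you need is the DISTINCT keyword.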

INSERT INTO Temp_B
SELECT DISTINCT Order_ID, Job_Number, Supplier_Name
FROM Temp_A
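One caveat worth checking against your data: DISTINCT compares all three selected columns, so two rows with the same Order_ID and Job_Number but different Supplier_Name values both survive it and still collide with the two-column unique constraint. The sketch below illustrates this with made-up temp tables and invented rows (#Demo_A, #Demo_B and the supplier names are assumptions, not your data), then falls back to grouping on the key columns only.

```sql
-- Hypothetical tables mirroring Temp_A and Temp_B.
CREATE TABLE #Demo_A
(
    Order_ID INT NOT NULL,
    Job_Number VARCHAR(20) NOT NULL,
    Supplier_Name VARCHAR(50) NOT NULL
);

CREATE TABLE #Demo_B
(
    Order_ID INT NOT NULL,
    Job_Number VARCHAR(20) NOT NULL,
    Supplier_Name VARCHAR(50) NOT NULL,
    UNIQUE (Order_ID, Job_Number)  -- unnamed, to avoid name clashes in tempdb
);

INSERT INTO #Demo_A (Order_ID, Job_Number, Supplier_Name)
VALUES (3, 'L154', 'Acme'),
       (3, 'L154', 'Acme'),    -- exact duplicate: DISTINCT removes it
       (3, 'L154', 'Globex');  -- same key, other supplier: DISTINCT keeps it

-- Attempt 1: DISTINCT over all three columns still yields two rows
-- for the key (3, L154), so the unique constraint rejects the insert.
BEGIN TRY
    INSERT INTO #Demo_B (Order_ID, Job_Number, Supplier_Name)
    SELECT DISTINCT Order_ID, Job_Number, Supplier_Name
    FROM #Demo_A;
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS Insert_Error;
END CATCH;

-- Attempt 2: group on the key columns only and pick one supplier per key.
INSERT INTO #Demo_B (Order_ID, Job_Number, Supplier_Name)
SELECT Order_ID, Job_Number, MAX(Supplier_Name)
FROM #Demo_A
GROUP BY Order_ID, Job_Number;

SELECT Order_ID, Job_Number, Supplier_Name
FROM #Demo_B;

DROP TABLE #Demo_A;
DROP TABLE #Demo_B;
```

If every duplicate in your file is an exact copy of another row, plain DISTINCT is enough; the second insert only matters when the suppliers differ for the same key.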

Answer 3: (score: 0)

You can add the DISTINCT keyword to your INSERT query:

INSERT INTO Temp_B
SELECT DISTINCT Order_ID, Job_Number, Supplier_Name
FROM Temp_A AS A
WHERE NOT EXISTS (
    SELECT 1
    FROM Temp_B AS B
    WHERE B.Order_ID = A.Order_ID
      AND B.Job_Number = A.Job_Number);