Check two columns and delete the lowest-numbered IDs in the column named contentid

Time: 2019-05-29 15:03:54

Tags: sql sql-server

The data comes from:

SELECT 
    CONTENTID, t1.TITLE, t1.PAGEID, COUNT, SPACENAME, CREATIONDATE, LASTMODDATE, VERSION 
FROM
    (SELECT 
         SPACEID, TITLE, PAGEID, COUNT(*) AS COUNT
     FROM 
         CONTENT
     WHERE 
         CONTENTTYPE = 'ATTACHMENT'
     GROUP BY 
         TITLE, PAGEID, SPACEID
     HAVING 
         COUNT(TITLE) > 1 AND COUNT(PAGEID) > 1) t1
JOIN
    (SELECT 
         CONTENTID, CREATIONDATE, LASTMODDATE, VERSION, TITLE, PAGEID 
     FROM 
         CONTENT 
     WHERE 
         VERSION = 1) t4 ON t4.PAGEID = t1.PAGEID
JOIN
    (SELECT 
         SPACEID, SPACENAME 
     FROM 
         SPACES) t2 ON t1.SPACEID = t2.SPACEID 
ORDER BY 
    t1.PAGEID,t1.TITLE, CREATIONDATE, LASTMODDATE

Output (some columns are omitted, because only the first two decide what gets deleted and the result is easier to show this way):

CONTENTID   TITLE
--------------------------------------------
26902677    Time Logging Guidelines V5.docx
46170401    Time Logging Guidelines V5.docx
157909073   Time Logging Guidelines V5.docx
157909072   Time Logging Guidelines V5.docx
355860497   Time Logging Guidelines V5.docx
535953771   Time Logging Guidelines V5.docx
540117589   Time Logging Guidelines V5.docx
554729950   Time Logging Guidelines V5.docx
1246646     Induction Plan Template.docx
472350756   Induction Plan Template.docx
535953845   Induction Plan Template.docx
544508546   Induction Plan Template.docx
544508547   Induction Plan Template.docx

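I need to delete every row except the one with the highest CONTENTID for each title. In the real data there are about 66k rows that need this fix.

Final result for the output above: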

554729950   Time Logging Guidelines V5.docx
544508547   Induction Plan Template.docx

3 answers:

Answer 0: (score: 1)

If I understand you correctly, this is simple:

DELETE FROM CONTENT WHERE CONTENTID NOT IN (SELECT MAX(CONTENTID) FROM CONTENT GROUP BY TITLE)
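Before running a delete like this, it may help to preview which rows it would remove. The query below is only a sketch under two assumptions that this answer does not state: that CONTENTID is unique across CONTENT, and that duplicates should be judged only among attachments and per TITLE, PAGEID and SPACEID, as in the grouping of the question's own query.

```sql
-- Preview only: lists the attachment rows that are NOT the highest CONTENTID
-- within their (TITLE, PAGEID, SPACEID) group. Nothing is deleted here.
-- Assumption: CONTENTID is unique and never NULL.
SELECT c.CONTENTID, c.TITLE, c.PAGEID
FROM CONTENT AS c
WHERE c.CONTENTTYPE = 'ATTACHMENT'
  AND c.CONTENTID NOT IN (
        SELECT MAX(k.CONTENTID)
        FROM CONTENT AS k
        WHERE k.CONTENTTYPE = 'ATTACHMENT'
        GROUP BY k.TITLE, k.PAGEID, k.SPACEID
      )
ORDER BY c.TITLE, c.CONTENTID;
```

If the listed rows look right, the same WHERE clause could be reused in a DELETE statement.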

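Answer 1: (score: 0)

One approach is to use ROW_NUMBER() to number the rows, with PARTITION BY on the TITLE column and ORDER BY on CONTENTID descending. Within each TITLE, the row with the highest CONTENTID gets row number 1. You can then delete every row whose row number is not 1.

Here is an example.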

-- create a temp table to store the test data
IF OBJECT_ID('tempdb..#test_data') IS NOT NULL DROP TABLE #test_data 
CREATE TABLE #test_data (
    content_id INT NOT NULL, 
    title VARCHAR(100) NOT NULL
)

-- add the test data
INSERT INTO #test_data
SELECT 26902677    , 'Time Logging Guidelines V5.docx'
UNION SELECT 46170401    , 'Time Logging Guidelines V5.docx'
UNION SELECT 157909073  , 'Time Logging Guidelines V5.docx'
UNION SELECT 157909072  , 'Time Logging Guidelines V5.docx'
UNION SELECT 355860497  , 'Time Logging Guidelines V5.docx'
UNION SELECT 535953771  , 'Time Logging Guidelines V5.docx'
UNION SELECT 540117589  , 'Time Logging Guidelines V5.docx'
UNION SELECT 554729950  , 'Time Logging Guidelines V5.docx'
UNION SELECT 1246646      , 'Induction Plan Template.docx'
UNION SELECT 472350756  , 'Induction Plan Template.docx'
UNION SELECT 535953845  , 'Induction Plan Template.docx'
UNION SELECT 544508546  , 'Induction Plan Template.docx'
UNION SELECT 544508547  , 'Induction Plan Template.docx';

-- generate a row number for each row, restarting at 1 for every title
WITH ordered AS (
    SELECT
        content_id, 
        title,
        ROW_NUMBER() OVER (PARTITION BY title ORDER BY content_id DESC) AS row_num
    FROM 
        #test_data
)

-- delete every row whose row number is greater than 1
DELETE d
FROM #test_data d
INNER JOIN ordered o ON d.content_id = o.content_id AND o.row_num > 1

-- check the results
SELECT * FROM #test_data 
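
As an optional extra check, a grouped query can confirm that exactly one row per title survived. This is a small sketch that reuses the #test_data table from the example above; the column aliases remaining_rows and kept_content_id are made up for illustration.

```sql
-- One row per title is expected, each with remaining_rows = 1
-- and kept_content_id equal to the highest content_id for that title.
SELECT
    title,
    COUNT(*)        AS remaining_rows,
    MAX(content_id) AS kept_content_id
FROM #test_data
GROUP BY title;
```

For the sample data this should leave 554729950 for 'Time Logging Guidelines V5.docx' and 544508547 for 'Induction Plan Template.docx', matching the final result shown in the question.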

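Answer 2: (score: 0)

It is not entirely clear what you want, and I am not sure what the joins are for. If you simply want to delete all records except the one with the highest CONTENTID for each TITLE, you can try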

WITH cte AS (
    SELECT *
        , ROW_NUMBER() OVER(PARTITION BY TITLE ORDER BY CONTENTID DESC) AS rn
    FROM Content
)
DELETE FROM cte
WHERE rn > 1
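
Since the question mentions tens of thousands of affected rows, one cautious way to try the statement above is inside a transaction that is rolled back after checking how many rows were removed. This is a sketch, not part of the original answer; the alias deleted_rows is hypothetical, and it partitions by TITLE only, exactly as the answer does.

```sql
BEGIN TRANSACTION;

WITH cte AS (
    SELECT CONTENTID,
           ROW_NUMBER() OVER (PARTITION BY TITLE ORDER BY CONTENTID DESC) AS rn
    FROM Content
)
DELETE FROM cte
WHERE rn > 1;

-- @@ROWCOUNT holds the number of rows affected by the DELETE just above
SELECT @@ROWCOUNT AS deleted_rows;

-- Dry run: undo the delete. Replace with COMMIT TRANSACTION once the count looks right.
ROLLBACK TRANSACTION;
```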