Create a view that removes multiple slices of data from a table based on different criteria

Asked: 2012-11-26 03:30:10

Tags: sql sql-server sql-server-2005

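The table below contains PC asset information, and I need to remove slices of data from it based on different criteria.

I need to create a view in SQL Server 2005 that returns the result.

I tried using temp tables to accomplish this, until I realized I can't use temp tables in a view.

I then tried using a CTE, until I realized that deleting data from a CTE also deletes the data from the actual table.

I can't delete data from the actual table. I can't create another table in the database.

The table has 160,000 records.

Table: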

CREATE TABLE dbo.dsm_hardware_basic
(
    [UUID] binary(16),               -- Randomly generated 16-byte key that is unique for each record; the only column with no duplicate values.
    [HostUUID] binary(16),           -- Randomly generated 16-byte key; column has duplicate values.
    [Name] nvarchar(255),            -- Hostname of the computer asset. Example: PCASSET001. Column has duplicate values.
    [LastAgentExecution] datetime,   -- The last time the software agent that collects asset information ran on the PC.
    [HostName] nvarchar(255)         -- Fully qualified domain name of the PC. Example: PCASSET001.companydomain.com. Column has duplicate values.
)

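I'll explain what I want to accomplish:

1) Read in all of the information in table dbo.dsm_hardware_basic. Let's call this dsm_hardware_basic_copy.

2) Query dbo.dsm_hardware_basic and remove from dsm_hardware_basic_copy the data matching the criteria below. This essentially removes the duplicate [HostUUID] with the oldest [LastAgentExecution] time: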

    SELECT    dsm_hardware_basic.[HostUUID]
             ,MIN(dsm_hardware_basic.[LastAgentExecution]) AS [LastAgentExecution]
    FROM      dsm_hardware_basic
    WHERE     dsm_hardware_basic.[HostUUID] <> ''
    GROUP BY  dsm_hardware_basic.[HostUUID]
    HAVING    COUNT(*) = 2 -- The tiny number of groups where this count is > 2 will be left alone.

3) Additionally, query dbo.dsm_hardware_basic and remove from dsm_hardware_basic_copy the data matching the criteria below. This essentially removes the duplicate [HostName] with the oldest [LastAgentExecution] time:

    SELECT    dsm_hardware_basic.[HostName]
             ,MIN(dsm_hardware_basic.[LastAgentExecution]) AS [LastAgentExecution]
    FROM      dsm_hardware_basic
    WHERE     dsm_hardware_basic.[HostName] <> ''
    GROUP BY  dsm_hardware_basic.[HostName]
    HAVING    COUNT(*) > 1

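I'm not sure how to do this in the select above, but not only should the COUNT of [HostName] be > 1, [Name] should also equal everything in [HostName] before the first period. Example [Name]: PCASSET001. Example [HostName]: PCASSET001.companydomain.com. I know this sounds odd considering the kind of PC data in these two columns, but it is something I really do have to deal with.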
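One possible way to express the "everything before the first period" condition is sketched below. It assumes the condition applies to each row before grouping, which is only one reading of the requirement, and the alias `hb` is introduced purely for illustration. `LEFT` and `CHARINDEX` are built-in T-SQL functions available in SQL Server 2005.

```sql
-- Sketch only: groups duplicate [HostName] values, counting only rows whose
-- [Name] equals the part of [HostName] before the first period.
SELECT    hb.[HostName]
         ,MIN(hb.[LastAgentExecution]) AS [LastAgentExecution]
FROM      dbo.dsm_hardware_basic AS hb
WHERE     hb.[HostName] <> ''
          -- Appending '.' guarantees CHARINDEX finds a period even when [HostName] has none,
          -- so LEFT then returns the whole [HostName].
          AND hb.[Name] = LEFT(hb.[HostName], CHARINDEX('.', hb.[HostName] + '.') - 1)
GROUP BY  hb.[HostName]
HAVING    COUNT(*) > 1;
```

With the example values above, `LEFT('PCASSET001.companydomain.com', 10)` yields `PCASSET001`, so that row would be counted toward its [HostName] group.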

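4) Additionally, query dbo.dsm_hardware_basic and remove from dsm_hardware_basic_copy the data matching the criteria below. This essentially removes the duplicate [Name] with the oldest [LastAgentExecution] time: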

    SELECT    dsm_hardware_basic.[Name]
             ,MIN(dsm_hardware_basic.[LastAgentExecution]) AS [LastAgentExecution]
    FROM      dsm_hardware_basic
    WHERE     dsm_hardware_basic.[Name] <> ''
    GROUP BY  dsm_hardware_basic.[Name]
    HAVING    COUNT(*) = 2 -- The tiny number of groups where this count is > 2 will be left alone.

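1 Answer:

Answer 0 (score: 0)

You've actually asked several different questions here, and I'm not sure I fully follow the logic of your queries, but it shouldn't be too hard to build this up.

First, you can use dsm_hardware_basic directly instead of a copy: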

SELECT 
    * 
FROM dsm_hardware_basic

Now for the part that "removes the duplicate [HostUUID] with the oldest [LastAgentExecution] time":

SELECT 
    dsm_hardware_basic.* 
FROM dsm_hardware_basic
INNER JOIN 
    (
        SELECT [UUID], ROW_NUMBER() OVER 
            (PARTITION BY [HostUUID] 
             ORDER BY [LastAgentExecution] DESC) AS host_UUID_rank
        FROM dsm_hardware_basic
        WHERE 
            [HostUUID] <> ''
    ) AS 
    duplicate_host_UUID_filtered ON dsm_hardware_basic.UUID = duplicate_host_UUID_filtered.UUID
    AND duplicate_host_UUID_filtered.host_UUID_rank = 1

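What this does is partition your table by HostUUID, number the rows in each partition starting from the most recent LastAgentExecution, and use the JOIN to keep only the row ranked 1, which removes every other UUID from the result. Rows that fail the [HostUUID] <> '' test never appear in the subquery's output, so the INNER JOIN drops them too.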
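If the rows with an empty [HostUUID], and the groups with more than two rows, should stay in the result as the question describes, a variant along the following lines might work. This is a sketch under assumptions, not the answer's method: the aliases `hb`, `ranked`, `host_uuid_rank` and `host_uuid_count` are made up for illustration, and the `= ''` test simply mirrors the question's `<> ''` comparison. `COUNT(*) OVER (PARTITION BY ...)` is supported in SQL Server 2005.

```sql
-- Sketch only: returns every row except the older row of each two-row [HostUUID] group.
SELECT  ranked.[UUID], ranked.[HostUUID], ranked.[Name],
        ranked.[LastAgentExecution], ranked.[HostName]
FROM
    (
        SELECT  hb.*
               ,ROW_NUMBER() OVER (PARTITION BY hb.[HostUUID]
                                   ORDER BY hb.[LastAgentExecution] DESC) AS host_uuid_rank
               ,COUNT(*)     OVER (PARTITION BY hb.[HostUUID])            AS host_uuid_count
        FROM    dbo.dsm_hardware_basic AS hb
    ) AS ranked
WHERE   ranked.host_uuid_rank = 1        -- newest row of its group: always kept
   OR   ranked.host_uuid_count <> 2      -- groups of 1 or of more than 2 rows are left alone
   OR   ranked.[HostUUID] = ''           -- empty values are not treated as duplicates
   OR   ranked.[HostUUID] IS NULL;       -- nor are NULLs
```

When two rows in a group share the same LastAgentExecution, ROW_NUMBER picks one of them arbitrarily, so a tie-breaking column in the ORDER BY may be worth adding.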

We can now apply the same logic to your HostName:

SELECT 
    dsm_hardware_basic.* 
FROM dsm_hardware_basic
INNER JOIN 
    (
        SELECT [UUID], ROW_NUMBER() OVER 
            (PARTITION BY [HostUUID] 
             ORDER BY [LastAgentExecution] DESC) AS host_UUID_rank
        FROM dsm_hardware_basic
        WHERE 
            [HostUUID] <> ''
    ) AS 
    duplicate_host_UUID_filtered ON dsm_hardware_basic.UUID = duplicate_host_UUID_filtered.UUID
    AND duplicate_host_UUID_filtered.host_UUID_rank = 1
INNER JOIN 
    (
        SELECT [UUID], ROW_NUMBER() OVER 
            (PARTITION BY [HostName] 
             ORDER BY [LastAgentExecution] DESC) AS host_name_rank
        FROM dsm_hardware_basic
        WHERE 
            [HostName] <> ''
    ) AS 
    duplicate_HostName_filtered ON dsm_hardware_basic.UUID = duplicate_HostName_filtered.UUID
    AND duplicate_HostName_filtered.host_name_rank = 1

I'll leave the last part as an exercise for you. Finally, once you've finished debugging, just add CREATE VIEW.
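For reference, the finished view might look roughly like the sketch below, which repeats the same ROW_NUMBER pattern a third time for [Name]. The view name and the aliases (`hb`, `by_host_uuid`, `by_host_name`, `by_name`) are hypothetical. Like the queries above, it keeps only the newest row per value and drops rows with an empty value in any of the three columns; it does not implement the question's COUNT(*) = 2 restriction.

```sql
-- Sketch only. The view name is hypothetical; CREATE VIEW must be the first statement in its batch.
CREATE VIEW dbo.vw_dsm_hardware_basic_latest
AS
SELECT  hb.[UUID], hb.[HostUUID], hb.[Name], hb.[LastAgentExecution], hb.[HostName]
FROM    dbo.dsm_hardware_basic AS hb
INNER JOIN
    (   -- newest row per [HostUUID]
        SELECT [UUID], ROW_NUMBER() OVER
            (PARTITION BY [HostUUID]
             ORDER BY [LastAgentExecution] DESC) AS host_uuid_rank
        FROM dbo.dsm_hardware_basic
        WHERE [HostUUID] <> ''
    ) AS by_host_uuid ON hb.[UUID] = by_host_uuid.[UUID]
    AND by_host_uuid.host_uuid_rank = 1
INNER JOIN
    (   -- newest row per [HostName]
        SELECT [UUID], ROW_NUMBER() OVER
            (PARTITION BY [HostName]
             ORDER BY [LastAgentExecution] DESC) AS host_name_rank
        FROM dbo.dsm_hardware_basic
        WHERE [HostName] <> ''
    ) AS by_host_name ON hb.[UUID] = by_host_name.[UUID]
    AND by_host_name.host_name_rank = 1
INNER JOIN
    (   -- newest row per [Name]
        SELECT [UUID], ROW_NUMBER() OVER
            (PARTITION BY [Name]
             ORDER BY [LastAgentExecution] DESC) AS name_rank
        FROM dbo.dsm_hardware_basic
        WHERE [Name] <> ''
    ) AS by_name ON hb.[UUID] = by_name.[UUID]
    AND by_name.name_rank = 1;
```

Once created, the view can be queried like a table, for example `SELECT * FROM dbo.vw_dsm_hardware_basic_latest`, and the underlying table is never modified.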