Need help with a SQL transaction to condense data in a database

Time: 2019-01-18 00:55:56

Tags: sql sql-server sqltransaction

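I am trying to condense the data in a database table that contains multiple instances of each unique record, with differing column data.

For each particular unique record, I want to select the most frequently occurring value in each column.

However, my SQL transaction is not working properly.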

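[dataBase1].[dbo].[table1] has several hundred thousand records with several columns (Name, Place, etc.).

[dataBase1].[dbo].[table2] contains the list of unique names from [table1]; its remaining columns ([Place], etc.) have headers but are still empty.

I tried the following code.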

DECLARE @name varchar(max);
DECLARE @place varchar(max);

DECLARE db_cursor SCROLL CURSOR FOR 
     SELECT [Name] 
     FROM [dataBase1].[dbo].[table2];

OPEN db_cursor;

FETCH NEXT FROM db_cursor INTO @name;

WHILE @@FETCH_STATUS = 0
BEGIN
     SELECT DISTINCT TOP(1) @place = [Place] 
     FROM [dataBase1].[dbo].[table1] 
     WHERE [Name] = @name 
       AND [Place] IS NOT NULL AND [Place] <> '' 
       AND (EXISTS  (SELECT [Place], COUNT (*) AS TOTAL 
                     FROM [dataBase1].[dbo].[table1] 
                     GROUP BY [Place])) 
     GROUP BY [Place];

     UPDATE [dataBase1].[dbo].[table2] 
     SET [Place] = @place 
     WHERE [Name] = @name;      

     SET @place = '';

     FETCH NEXT FROM db_cursor INTO @name
END
Specific unique [Place]

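The [Name] column has 53 values, and the highest repeated-value count is 3. Essentially, I want to automatically run the following SQL transaction for every unique [Name].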

SELECT DISTINCT TOP 1 
    [Place], COUNT (*) TOTAL 
FROM 
    [dataBase1].[dbo].[table1] 
WHERE 
    [Name] = 'xxxxxx' 
    AND [Place] IS NOT NULL AND [Place] <> '' 
GROUP BY [Place] 
ORDER BY TOTAL DESC;

1 Answer:

Answer 0 (score: 0)

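This can be done in a number of steps, each building on the one before. You want to process all names and places at once, rather than one name at a time.

First, you want a count for each name/place combination, so group by name and place and count the places. Your query would look like this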

SELECT name, place, COUNT(place) as placecount
FROM table1
GROUP BY name, place
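
To see what this grouping step produces, here is a minimal sketch on a handful of invented rows. The table variable @table1 and its contents are hypothetical stand-ins for the real table1; only the column names come from the question.

```sql
-- Hypothetical sample rows standing in for table1 (all values are invented).
DECLARE @table1 TABLE ([Name] varchar(50), [Place] varchar(50));

INSERT INTO @table1 ([Name], [Place]) VALUES
    ('alpha', 'Paris'),
    ('alpha', 'Paris'),
    ('alpha', 'Rome'),
    ('alpha', NULL),
    ('beta',  'Oslo'),
    ('beta',  '');

-- One row per (Name, Place) pair.
-- COUNT([Place]) skips NULLs, so the ('alpha', NULL) group gets a count of 0,
-- while the empty string in the 'beta' group is still counted as a value.
SELECT [Name], [Place], COUNT([Place]) AS placecount
FROM @table1
GROUP BY [Name], [Place]
ORDER BY [Name], placecount DESC;
```

Note that the empty string still shows up as a candidate place here, which matters if blank values should be ignored as in the question's own query.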

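Now you need to find the one with the highest count and, in the case of a tie, the first one alphabetically. You can do this by applying ROW_NUMBER to the results above, restarting the numbering for each name (the partition) and ordering by place count, highest first, then by place to break ties. Using a CTE (you could also do this as a subquery), it looks like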

WITH places as (
  SELECT name, place, COUNT(place) as placecount
  FROM table1
  GROUP BY name, place
)
SELECT name, place, ROW_NUMBER() OVER (PARTITION BY name ORDER BY placecount DESC, place) as RN
FROM places
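
The numbering and tie-break can be checked in isolation with a small sketch like the one below. The inline VALUES rows are invented pre-aggregated counts, not data from the question; the ordering is by count descending so that the highest count receives RN = 1.

```sql
-- Hypothetical pre-counted rows: 'alpha' has a tie between Rome and Paris.
SELECT name, place, placecount,
       ROW_NUMBER() OVER (PARTITION BY name
                          ORDER BY placecount DESC, place) AS RN
FROM (VALUES
        ('alpha', 'Rome',  2),
        ('alpha', 'Paris', 2),
        ('alpha', 'Oslo',  1),
        ('beta',  'Oslo',  5)
     ) AS places (name, place, placecount)
ORDER BY name, RN;
```

For 'alpha', Paris and Rome tie on the count, so the alphabetical tie-break gives Paris RN = 1 and Rome RN = 2; the numbering restarts at 1 for 'beta'.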

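If you look at this data, the place you want for any given name is on the row where RN is 1. So you can get the final data you want with a query like this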

WITH places as (
  SELECT name, place, COUNT(place) as placecount
  FROM table1
  GROUP BY name, place
), orderplaces as (
  SELECT name, place, ROW_NUMBER() OVER (PARTITION BY name ORDER BY placecount DESC, place) as RN
  FROM places
)
SELECT name, place
FROM orderplaces
WHERE RN = 1
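
The question's own single-name query also excluded NULL and empty places. A possible variant of the query that applies the same filter before counting is sketched below; it assumes the fully qualified table name and bracketed column names from the question, and orders by count descending so that the most frequent place gets RN = 1.

```sql
WITH places AS (
    -- Count each (Name, Place) pair, ignoring missing or blank places,
    -- as the question's single-name query does.
    SELECT [Name], [Place], COUNT(*) AS placecount
    FROM [dataBase1].[dbo].[table1]
    WHERE [Place] IS NOT NULL AND [Place] <> ''
    GROUP BY [Name], [Place]
), orderplaces AS (
    -- Rank places within each name: highest count first, then alphabetical.
    SELECT [Name], [Place],
           ROW_NUMBER() OVER (PARTITION BY [Name]
                              ORDER BY placecount DESC, [Place]) AS RN
    FROM places
)
SELECT [Name], [Place]
FROM orderplaces
WHERE RN = 1;
```

With the filter in place, a name whose places are all NULL or blank produces no row at all, so it would simply be left untouched by a later update.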

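Since you want to update table2 with this place data rather than just view it, you would join table2 to the final query and do an update, something like this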

WITH places as (
  SELECT name, place, COUNT(place) as placecount
  FROM table1
  GROUP BY name, place
), orderplaces as (
  SELECT name, place, ROW_NUMBER() OVER (PARTITION BY name ORDER BY placecount DESC, place) as RN
  FROM places
)
UPDATE T2 SET place = OP.place
FROM orderplaces OP
   INNER JOIN table2 T2 on T2.name = OP.name
WHERE RN = 1;
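
Because the update rewrites table2 in one statement, it may be worth trying it inside an explicit transaction first. The following is a sketch under the assumption that the tables are named as in the question; it ranks by count descending, skips NULL and blank places, and rolls back by default so nothing is kept until you decide to commit.

```sql
BEGIN TRANSACTION;

WITH places AS (
    SELECT [Name], [Place], COUNT(*) AS placecount
    FROM [dataBase1].[dbo].[table1]
    WHERE [Place] IS NOT NULL AND [Place] <> ''
    GROUP BY [Name], [Place]
), orderplaces AS (
    SELECT [Name], [Place],
           ROW_NUMBER() OVER (PARTITION BY [Name]
                              ORDER BY placecount DESC, [Place]) AS RN
    FROM places
)
UPDATE T2
SET [Place] = OP.[Place]
FROM [dataBase1].[dbo].[table2] AS T2
    INNER JOIN orderplaces AS OP ON OP.[Name] = T2.[Name]
WHERE OP.RN = 1;

-- Inspect the result while the transaction is still open.
SELECT [Name], [Place]
FROM [dataBase1].[dbo].[table2]
ORDER BY [Name];

-- Undo the change; swap this for COMMIT TRANSACTION once the output looks right.
ROLLBACK TRANSACTION;
```

This replaces the cursor loop from the question with a single set-based statement, which is usually the simpler option on a table with hundreds of thousands of rows.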