Select the record with the maximum count from a SQL result set

Time: 2015-09-17 13:59:20

Tags: sql-server

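I have a table that currently has customerId and filePath columns but no constraints, so there may be multiple records for the same customer with different file paths. The goal of the script I am trying to write is a result containing customerId and filePath, where filePath is the one used most often for that customer (Count(filePath) is the maximum). Each record in the result should hold a unique customerId in the first column and, in the second column, the filePath that has the most records associated with that customer.

What I have so far is: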

SELECT customerId, localFilePath, Count(customerId) as Count1 
FROM CustomerDetails
GROUP BY localFilePath, customerId

which returns:

CustomerId LocalFilePath Count1
3          AnotherFilePath  1
3          localFilePath    3
11         localFilePath    1
331        localFilePath    1
2414       localFilePath    3
2527       localFilePath    1
2528       localFilePath    1
2533       localFilePath    1
2535       localFilePath    1

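Currently only the customer with id 3 has multiple values, but it should not matter whether there is one such customer or several. In this case I want the rows for the other customers returned unchanged, since they do not have multiple file paths, but for customerId = 3 I want only the row with Count1 = 3 returned, as shown in the result below.

Edit

The expected result is: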

CustomerId LocalFilePath Count1
3          localFilePath    3
11         localFilePath    1
331        localFilePath    1
2414       localFilePath    3
2527       localFilePath    1
2528       localFilePath    1
2533       localFilePath    1
2535       localFilePath    1

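The idea is that every unique customerId must be kept, and records sharing the same customerId should be filtered so that only one record per customerId remains: the one with the highest value in the Count1 column.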
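For anyone who wants to reproduce the output above, here is a minimal sketch of sample data. The column types are assumptions (the question does not state them), and the table deliberately has no unique constraint, matching the description.

```sql
-- Hypothetical schema: column types are assumed, not taken from the question.
CREATE TABLE CustomerDetails (
    customerId    INT          NOT NULL,
    localFilePath VARCHAR(260) NOT NULL
);

-- Rows chosen so that the GROUP BY query above yields the counts shown.
INSERT INTO CustomerDetails (customerId, localFilePath)
VALUES
    (3,    'AnotherFilePath'),
    (3,    'localFilePath'),
    (3,    'localFilePath'),
    (3,    'localFilePath'),
    (11,   'localFilePath'),
    (331,  'localFilePath'),
    (2414, 'localFilePath'),
    (2414, 'localFilePath'),
    (2414, 'localFilePath'),
    (2527, 'localFilePath'),
    (2528, 'localFilePath'),
    (2533, 'localFilePath'),
    (2535, 'localFilePath');
```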

5 answers:

Answer 0 (score: 3)

Use a CTE:

;WITH
    cte AS  
    (
        SELECT customerId, localFilePath, Count(customerId) as Count1 
        FROM CustomerDetails
        GROUP BY localFilePath, customerId
    )

SELECT  customerid, localFilePath, Count1
FROM    (
            SELECT  customerid, localFilePath, Count1,
                    rn = ROW_NUMBER() OVER (PARTITION BY customerID ORDER BY Count1 DESC)
            FROM    cte
        ) temp
WHERE   temp.rn = 1

Answer 1 (score: 2)

Analytic functions are helpful in your case. This query ranks the location counts per customer:

SELECT
  customerId,  -- included so each row can be tied back to its customer
  localFilePath,
  RANK() OVER (
    PARTITION BY customerId
    ORDER BY Count1 DESC) AS CountRank
FROM (
  SELECT customerId, localFilePath, COUNT(*) AS Count1
  FROM CustomerDetails
  GROUP BY customerId, localFilePath
) InitCalc

The most-used location for each customer gets a CountRank value of 1. To limit the results to rows with CountRank = 1, you have to wrap the query once more:

SELECT * FROM (
  SELECT
    customerId,  -- included so each row can be tied back to its customer
    localFilePath,
    RANK() OVER (
      PARTITION BY customerId
      ORDER BY Count1 DESC) AS CountRank
  FROM (
    SELECT customerId, localFilePath, COUNT(*) AS Count1
    FROM CustomerDetails
    GROUP BY customerId, localFilePath
  ) InitCalc
) CountCalc
WHERE CountRank = 1

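If a customer's most-used locations are tied, the query above returns all of the first-place records for that customer. If you want only one location per customer, change RANK() to ROW_NUMBER(), but be aware that this picks one winner arbitrarily from among the tied values.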
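If an arbitrary winner is not acceptable, one option is to add a second sort key so that the choice becomes repeatable. The sketch below assumes that ordering tied paths alphabetically by localFilePath is an acceptable rule. That rule is an illustration only and does not come from the question.

```sql
-- One row per customer; ties on Count1 are broken by path name (assumed rule).
SELECT customerId, localFilePath, Count1
FROM (
  SELECT
    customerId,
    localFilePath,
    Count1,
    ROW_NUMBER() OVER (
      PARTITION BY customerId
      ORDER BY Count1 DESC, localFilePath ASC) AS rn
  FROM (
    SELECT customerId, localFilePath, COUNT(*) AS Count1
    FROM CustomerDetails
    GROUP BY customerId, localFilePath
  ) InitCalc
) Ranked
WHERE rn = 1;
```

Any column or expression that distinguishes the tied rows would work in place of localFilePath as the tie-breaker.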

Answer 2 (score: 0)

You can use this query:

with cte as
(
SELECT customerID, localFilePath, Count(customerID) as Count1 
FROM CustomerDetails
GROUP BY localFilePath, customerID
)
,
cte1
as
(select customerID,max(Count1) as count2 from cte group by customerID)
select cte.customerid,cte.localfilepath,cte.count1 from cte inner join
cte1 on cte.customerid=cte1.customerid and cte.count1=cte1.count2 order by
customerid

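Now, if (for example) the two counts for ID 3 were the same, i.e. there were two more rows for "AnotherFilePath" in the CustomerDetails table, this would produce two rows for ID 3, and you would then have to decide which one to keep based on some criterion. Is that important? If any ID has the same number of records for several file paths, do you have a way to decide between those file paths?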
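To check whether the tie case actually occurs in the data before choosing a criterion, a query along these lines could list the affected customers. This is a sketch only; the aliases TiedCount and TiedPaths are made up for illustration.

```sql
WITH cte AS (
    SELECT customerId, localFilePath, COUNT(*) AS Count1
    FROM CustomerDetails
    GROUP BY customerId, localFilePath
),
ranked AS (
    -- RANK() gives every path that shares the top count a rank of 1.
    SELECT customerId, localFilePath, Count1,
           RANK() OVER (PARTITION BY customerId ORDER BY Count1 DESC) AS rnk
    FROM cte
)
-- Customers for whom more than one path shares the top count.
SELECT customerId, Count1 AS TiedCount, COUNT(*) AS TiedPaths
FROM ranked
WHERE rnk = 1
GROUP BY customerId, Count1
HAVING COUNT(*) > 1;
```

An empty result means no customer currently has a tie for first place.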

Answer 3 (score: 0)

You can do it like this:

-- create table variable
DECLARE @Table TABLE( 
  id int,
  filePath varchar(30) NOT NULL, 
  Count1 int
); 

INSERT INTO @Table 
SELECT customerId, localFilePath, Count(customerId) as Count1 
FROM CustomerDetails
GROUP BY localFilePath, customerId

-- Then select the max
SELECT m.id, m.MaxCount, (SELECT Top 1 filePath from @Table where id = m.id and Count1 = m.MaxCount) as ThePath
FROM (
    SELECT t.id, MAX(t.Count1) as MaxCount
    FROM @Table t
    GROUP BY t.id
) m

The result is:

id      MaxCount    ThePath
3       3           localFilePath
11      1           localFilePath
331     1           localFilePath
2414    3           localFilePath
2527    1           localFilePath
2528    1           localFilePath
2533    1           localFilePath
2535    1           localFilePath

Answer 4 (score: 0)

Without a CTE:

SELECT top 1 customerId, localFilePath, Count(customerId) as Count1 
FROM CustomerDetails
GROUP BY localFilePath, customerId
order by Count(customerId) desc
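Note that, as written, TOP 1 returns a single row for the whole table rather than one row per customer. A sketch of one way to keep the CTE-free form and still get one row per customer is TOP (1) WITH TIES ordered by a per-customer ROW_NUMBER(). The localFilePath tie-breaker in the window ordering is an assumption added so that the result is repeatable.

```sql
-- Each customer's best path gets row number 1; WITH TIES keeps every
-- row that shares that first value, i.e. one row per customer.
SELECT TOP (1) WITH TIES
       customerId, localFilePath, COUNT(customerId) AS Count1
FROM CustomerDetails
GROUP BY localFilePath, customerId
ORDER BY ROW_NUMBER() OVER (
             PARTITION BY customerId
             ORDER BY COUNT(customerId) DESC, localFilePath);
```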