Question

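I have a table with customerId and filePath columns but no constraints, so there can be multiple records for the same customer with different file paths. The goal of the script I'm trying to write is a result set of customerId and filePath, where filePath is the one used most often for that customer (the one whose Count(filePath) is the maximum). Each row in the result should have a unique customerId in the first column and, in the second column, the filePath that has the most records associated with that customer.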

What I have so far is:

SELECT customerId, localFilePath, Count(customerId) as Count1 
FROM CustomerDetails
GROUP BY localFilePath, customerId

Which returns:

CustomerId LocalFilePath Count1
3          AnotherFilePath  1
3          localFilePath    3
11         localFilePath    1
331        localFilePath    1
2414       localFilePath    3
2527       localFilePath    1
2528       localFilePath    1
2533       localFilePath    1
2535       localFilePath    1

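Currently only the customer with ID 3 has multiple values, but it should not matter whether one customer or several do. In this case I want to return the rows for the other customers as they are, since they don't have multiple file paths, but for customerId = 3 I want to get back only the row with Count1 = 3, as shown in the results.

Edit

The expected result is: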

CustomerId LocalFilePath Count1
3          localFilePath    3
11         localFilePath    1
331        localFilePath    1
2414       localFilePath    3
2527       localFilePath    1
2528       localFilePath    1
2533       localFilePath    1
2535       localFilePath    1

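The idea is that every unique customerId must be kept, and records sharing the same customerId should be filtered so that only one record remains for that customerId: the one with the highest value in the Count1 column.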
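
For anyone who wants to reproduce the output above, here is a minimal setup sketch. The column types are assumptions (the real table definition isn't shown here); only the table name, the two column names, and the row counts come from the results above.

```sql
-- Hypothetical schema: the real column types are not given in the question.
CREATE TABLE CustomerDetails (
    customerId    INT          NOT NULL,
    localFilePath VARCHAR(260) NOT NULL
);

-- Rows chosen so that the GROUP BY query above yields the counts shown.
INSERT INTO CustomerDetails (customerId, localFilePath) VALUES
    (3, 'localFilePath'), (3, 'localFilePath'), (3, 'localFilePath'),
    (3, 'AnotherFilePath'),
    (11, 'localFilePath'),
    (331, 'localFilePath'),
    (2414, 'localFilePath'), (2414, 'localFilePath'), (2414, 'localFilePath'),
    (2527, 'localFilePath'),
    (2528, 'localFilePath'),
    (2533, 'localFilePath'),
    (2535, 'localFilePath');
```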

Answer 1

Using a CTE:

;WITH
    cte AS  
    (
        SELECT customerId, localFilePath, Count(customerId) as Count1 
        FROM CustomerDetails
        GROUP BY localFilePath, customerId
    )

SELECT  customerid, localFilePath, Count1
FROM    (
            SELECT  customerid, localFilePath, Count1,
                    rn = ROW_NUMBER() OVER (PARTITION BY customerID ORDER BY Count1 DESC)
            FROM    cte
        ) temp
WHERE   temp.rn = 1
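
One thing to be aware of: when two file paths are tied for the highest count, ROW_NUMBER() ordered only by Count1 picks one of them arbitrarily. A possible variant, sketched below, adds a second sort key so the choice is repeatable. Using localFilePath in ascending order as the tie-breaker is purely an assumption for illustration; substitute whatever rule fits your data.

```sql
WITH cte AS
(
    SELECT customerId, localFilePath, COUNT(*) AS Count1
    FROM CustomerDetails
    GROUP BY customerId, localFilePath
)
SELECT customerId, localFilePath, Count1
FROM (
    SELECT customerId, localFilePath, Count1,
           ROW_NUMBER() OVER (
               PARTITION BY customerId
               -- Highest count first; the second key is an assumed tie-breaker.
               ORDER BY Count1 DESC, localFilePath ASC
           ) AS rn
    FROM cte
) ranked
WHERE rn = 1
ORDER BY customerId;
```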

Answer 2

Analytic (window) functions are well suited to your case. This query ranks the location counts per customer:

SELECT
  customerId,      -- included so each row can be tied back to its customer
  localFilePath,
  Count1,
  RANK() OVER (
    PARTITION BY customerId
    ORDER BY Count1 DESC) AS CountRank
FROM (
  SELECT customerId, localFilePath, COUNT(*) AS Count1
  FROM CustomerDetails
  GROUP BY customerId, localFilePath
) InitCalc

The most-used location for each customer has a CountRank of 1. To restrict the results to rows where CountRank = 1, you have to wrap the query once more:

SELECT customerId, localFilePath, Count1 FROM (
  SELECT
    customerId,
    localFilePath,
    Count1,
    RANK() OVER (
      PARTITION BY customerId
      ORDER BY Count1 DESC) AS CountRank
  FROM (
    SELECT customerId, localFilePath, COUNT(*) AS Count1
    FROM CustomerDetails
    GROUP BY customerId, localFilePath
  ) InitCalc
) CountCalc
WHERE CountRank = 1

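If a customer's most-used location is tied with another, the query above returns all of the first-place records for that customer. If you want only one location per customer, change RANK() to ROW_NUMBER(), but note that this picks a winner arbitrarily among the tied values.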
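
To see the difference between the two functions on a tie, here is a small self-contained sketch. The inline rows are made-up values (not taken from your table) in which customer 3 has two paths with the same count.

```sql
SELECT customerId, localFilePath, Count1,
       RANK()       OVER (PARTITION BY customerId ORDER BY Count1 DESC) AS CountRank,
       ROW_NUMBER() OVER (PARTITION BY customerId ORDER BY Count1 DESC) AS CountRowNumber
FROM (VALUES
        (3,  'localFilePath',   2),   -- illustrative data: tied with the next row
        (3,  'AnotherFilePath', 2),
        (11, 'localFilePath',   1)
     ) AS InitCalc (customerId, localFilePath, Count1);
```

Both rows for customer 3 get CountRank = 1, so filtering on CountRank = 1 keeps both. With ROW_NUMBER() one of them gets 1 and the other gets 2, and which is which is not guaranteed unless you add another ORDER BY key.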

Answer 3

You can use this query:

WITH cte AS
(
    SELECT customerID, localFilePath, COUNT(customerID) AS Count1
    FROM CustomerDetails
    GROUP BY localFilePath, customerID
),
cte1 AS
(
    SELECT customerID, MAX(Count1) AS count2
    FROM cte
    GROUP BY customerID
)
SELECT cte.customerID, cte.localFilePath, cte.Count1
FROM cte
INNER JOIN cte1
    ON cte.customerID = cte1.customerID
   AND cte.Count1 = cte1.count2
ORDER BY cte.customerID

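Now, if (for example) the two counts for ID 3 were the same, i.e. there were two more rows for 'AnotherFilePath' in the CustomerDetails table, this would produce two rows for ID 3, and you would then have to pick one by some criterion. Does that matter? If any ID has the same number of records for several file paths, do you have a way to decide between them?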
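
If you do need exactly one row per ID even when counts are tied, one possible approach is to collapse the tied rows with an aggregate. The sketch below assumes that "the alphabetically first path wins" is an acceptable rule, which is only an illustration; replace MIN() with whatever criterion you actually want.

```sql
WITH cte AS
(
    SELECT customerId, localFilePath, COUNT(*) AS Count1
    FROM CustomerDetails
    GROUP BY customerId, localFilePath
),
cte1 AS
(
    SELECT customerId, MAX(Count1) AS Count2
    FROM cte
    GROUP BY customerId
)
SELECT cte.customerId,
       MIN(cte.localFilePath) AS localFilePath,  -- assumed tie-breaking rule
       cte1.Count2            AS Count1
FROM cte
INNER JOIN cte1
    ON cte.customerId = cte1.customerId
   AND cte.Count1 = cte1.Count2
GROUP BY cte.customerId, cte1.Count2
ORDER BY cte.customerId;
```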

Answer 4

You can do it like this:

-- create table variable
DECLARE @Table TABLE( 
  id int,
  filePath varchar(30) NOT NULL, 
  Count1 int
); 

INSERT INTO @Table 
SELECT customerId, localFilePath, Count(customerId) as Count1 
FROM CustomerDetails
GROUP BY localFilePath, customerId

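-- Then select the max count per customer and look up a path that has that count.
-- ORDER BY makes the TOP 1 pick repeatable when several paths share the max count.
SELECT m.id, m.MaxCount, (SELECT TOP 1 filePath FROM @Table WHERE id = m.id AND Count1 = m.MaxCount ORDER BY filePath) AS ThePath
FROM (
    SELECT t.id, MAX(t.Count1) AS MaxCount
    FROM @Table t
    GROUP BY t.id
) m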

The result is:

id      MaxCount    ThePath
3       3           localFilePath
11      1           localFilePath
331     1           localFilePath
2414    3           localFilePath
2527    1           localFilePath
2528    1           localFilePath
2533    1           localFilePath
2535    1           localFilePath

Answer 5

Without a CTE:

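-- A plain TOP 1 would return a single row for the whole table, so use
-- WITH TIES and order by a per-customer row number to keep one row per customer.
SELECT TOP 1 WITH TIES customerId, localFilePath, COUNT(customerId) AS Count1
FROM CustomerDetails
GROUP BY localFilePath, customerId
ORDER BY ROW_NUMBER() OVER (PARTITION BY customerId ORDER BY COUNT(customerId) DESC)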

Select the records with the maximum count from a SQL result set
