T-SQL query to find the most frequently repeated string answer across two columns?

Time: 2018-07-20 17:06:51

Tags: sql tsql

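I have survey results collected from the same group of users on different dates, and I want to find the most frequently repeated answer for each user.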

This is what my data looks like:

(image not available)

This is what I want to achieve:

(image not available)


4 Answers:

Answer 0 (score: 1)

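For Microsoft SQL Server:

  1. Count the occurrences of each distinct value in each column separately
  2. Sum those counts per user and value, then rank the rows within each user (PARTITION BY) by that sum, highest first
  3. Keep only the top-ranked record (row 1) for each user

Code: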

DECLARE @tbl TABLE (UserId TINYINT, Form1 NVARCHAR(20), Form2 NVARCHAR(20))
INSERT @tbl (UserId,Form1,Form2) SELECT 1, 'Y', 'Y';
INSERT @tbl (UserId,Form1,Form2) SELECT 1, 'Y', 'Y';
INSERT @tbl (UserId,Form1,Form2) SELECT 1, 'D', 'Y';
INSERT @tbl (UserId,Form1,Form2) SELECT 1, 'D', 'D';
INSERT @tbl (UserId,Form1,Form2) SELECT 1, 'C', 'Y';
INSERT @tbl (UserId,Form1,Form2) SELECT 2, 'D', 'Y';
INSERT @tbl (UserId,Form1,Form2) SELECT 2, 'D', 'Y';
INSERT @tbl (UserId,Form1,Form2) SELECT 2, 'D', 'Y';
INSERT @tbl (UserId,Form1,Form2) SELECT 2, 'D', 'D';
INSERT @tbl (UserId,Form1,Form2) SELECT 2, 'C', 'Y';

SELECT *
FROM (
    SELECT
          UserId,[String],SUM(Cnt) AS [SumCnt]
        , ROW_NUMBER() OVER(PARTITION BY UserId ORDER BY SUM(Cnt) DESC) AS [Row]
    FROM (
        SELECT UserId,Form1 AS [String],COUNT(Form1) AS Cnt
        FROM @tbl
        GROUP BY UserId,Form1
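        -- UNION ALL, not UNION: identical (UserId, value, count) rows from both columns must both be kept
        UNION ALL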
        SELECT UserId,Form2,COUNT(Form2) AS Cnt
        FROM @tbl
        GROUP BY UserId,Form2
    ) col
    GROUP BY UserId,[String]
) ord
WHERE ord.[Row]=1
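
A note on combining the per-column counts in a query of this shape: UNION removes duplicate rows, so when a user has the same count for a value in both columns, one of the two identical rows is dropped before the sum is taken. UNION ALL keeps both. The sketch below illustrates the difference with a hypothetical table variable @demo (not part of the original answer), in which 'Y' appears twice in each column.

```sql
-- Hypothetical two-row sample: Form1 has 'Y' twice and Form2 has 'Y' twice.
DECLARE @demo TABLE (UserId TINYINT, Form1 NVARCHAR(20), Form2 NVARCHAR(20));
INSERT @demo (UserId, Form1, Form2) VALUES (1, 'Y', 'Y'), (1, 'Y', 'Y');

-- Both branches yield the row (1, 'Y', 2).
-- UNION collapses them into one row; UNION ALL keeps both.
SELECT 'UNION' AS Variant, SUM(Cnt) AS SumCnt          -- SumCnt = 2 (undercount)
FROM (
    SELECT UserId, Form1 AS [String], COUNT(Form1) AS Cnt
    FROM @demo GROUP BY UserId, Form1
    UNION
    SELECT UserId, Form2, COUNT(Form2)
    FROM @demo GROUP BY UserId, Form2
) u
UNION ALL
SELECT 'UNION ALL', SUM(Cnt)                           -- SumCnt = 4 (correct)
FROM (
    SELECT UserId, Form1 AS [String], COUNT(Form1) AS Cnt
    FROM @demo GROUP BY UserId, Form1
    UNION ALL
    SELECT UserId, Form2, COUNT(Form2)
    FROM @demo GROUP BY UserId, Form2
) a;
```

The sample data in this answer happens not to contain such a collision, which is why the problem would not show up there.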

Answer 1 (score: 1)

I would do it like this:

select dv.*
from (select d.userid, v.ans, count(*) as cnt,
             row_number() over (partition by d.userid order by count(*) desc) as seqnum
      from mydata d cross apply
           (values (form_1), (form_2)) v(ans)
      group by d.userid, v.ans
     ) dv
where seqnum = 1;

I think the version above is easier to follow, but you can also write this without a subquery:

select top (1) with ties d.userid, v.ans, count(*) as cnt
from mydata d cross apply
     (values (form_1), (form_2)) v(ans)
group by d.userid, v.ans
order by row_number() over (partition by d.userid order by count(*) desc);
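
To see what the CROSS APPLY (VALUES ...) step does on its own, here is a minimal sketch. The table variable @mydata and its rows are made up for illustration; only the column names mirror the queries above.

```sql
-- Hypothetical sample data; column names follow the queries in this answer.
DECLARE @mydata TABLE (userid INT, form_1 NVARCHAR(20), form_2 NVARCHAR(20));
INSERT @mydata (userid, form_1, form_2)
VALUES (1, 'Y', 'N'), (1, 'Y', 'Y'), (2, 'N', 'N');

-- Each input row is turned into two rows, one per answer column,
-- so the three rows above become six (userid, ans) pairs.
SELECT d.userid, v.ans
FROM @mydata d
CROSS APPLY (VALUES (d.form_1), (d.form_2)) v(ans)
ORDER BY d.userid;
```

Once the two columns are stacked into a single ans column like this, the per-user counting and ranking reduce to an ordinary GROUP BY with a window function.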

Answer 2 (score: 0)

Here you go:

-- nvarchar with no length defaults to nvarchar(1) in a column definition; spelled out here
create table tab (userid int, form1 nvarchar(1), form2 nvarchar(1));
insert tab values (1, 'y', 'n');
insert tab values (1, 'y', 'n');
insert tab values (1, 'n', 'n');
insert tab values (1, 'n', 'n');
insert tab values (2, 'y', 'y');
insert tab values (2, 'y', 'y');
insert tab values (2, 'n', 'n');
insert tab values (2, 'n', 'n');
insert tab values (3, 'y', 'y');

with x as (
  select userid, form1 as resp, count(*) as cnt from tab group by userid, form1
  union all
  select userid, form2, count(*) as cnt from tab group by userid, form2
  ),
  y as (
  select userid, resp, sum(cnt) as totcnt
    from x
    group by userid, resp
  ),
  z as (
  select userid, max(totcnt) as maxcnt from y group by userid
  )
select z.*, y.resp from z
  join y on y.userid = z.userid and y.totcnt = z.maxcnt

Result:

userid       maxcnt       resp  
--------------------------------
1            6            n     
2            4            n     
2            4            y     
3            2            y  

Note that for userid = 2 it returns two "most common answers", because y and n are tied at 4 each.
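
If exactly one row per user is wanted even when answers are tied, one possible approach (a sketch, not part of the original answer) is to rank with ROW_NUMBER() and add an explicit tie-breaker. The choice of tie-breaker here, alphabetical order of resp, is an arbitrary assumption; the CTE names are also made up.

```sql
-- Assumes the table tab (userid, form1, form2) created above.
-- The statement before a CTE must be terminated with a semicolon.
with unpivoted as (
  select userid, form1 as resp from tab
  union all
  select userid, form2 from tab
),
counted as (
  select userid, resp, count(*) as totcnt
  from unpivoted
  group by userid, resp
),
ranked as (
  select userid, resp, totcnt,
         -- ties on totcnt are broken by resp, so the result is deterministic
         row_number() over (partition by userid
                            order by totcnt desc, resp) as rn
  from counted
)
select userid, totcnt as maxcnt, resp
from ranked
where rn = 1;
```

With the sample data above, userid = 2 should then return only n. Replacing row_number() with rank() and ordering by totcnt desc alone would keep all tied answers instead, matching the join-based result.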

Answer 3 (score: -1)

Take a look at this query:

-- NVARCHAR with no length defaults to NVARCHAR(1) in a declaration; spelled out here
DECLARE @tbl TABLE (UserId INT, Form1 NVARCHAR(1), Form2 NVARCHAR(1))
INSERT @tbl VALUES (1, 'Y', 'N')
INSERT @tbl VALUES (1, 'Y', 'N')
INSERT @tbl VALUES (1, 'N', 'N')
INSERT @tbl VALUES (1, 'N', 'N')
INSERT @tbl VALUES (2, 'Y', 'Y')
INSERT @tbl VALUES (2, 'Y', 'Y')
INSERT @tbl VALUES (2, 'N', 'N')
INSERT @tbl VALUES (2, 'N', 'N')
INSERT @tbl VALUES (3, 'Y', 'Y')

-- Counts how many 'Y' answers each user gave across both columns
SELECT X.UserId, SUM(X.Total) AS Total FROM (
    SELECT UserId,
           CASE WHEN Form1 = 'Y' THEN 1 ELSE 0 END
         + CASE WHEN Form2 = 'Y' THEN 1 ELSE 0 END AS Total
    FROM @tbl
) X
GROUP BY X.UserId
ORDER BY Total DESC