I have a table that contains survey results:
submitter issue q1 q2 q3 q4 q5
mike 11557 4 3 4 5 1
mark 13554 5 5 5 5 5
luke 15110 1 1 1 1 1
luke 15110 1 1 1 1 1
donald 16900 4 2 2 4 5
joe 11562 5 5 5 5 5
joe 11562 5 5 5 5 5
sam 12485 2 3 4 3 4
sam 12485 2 3 4 3 4
sam 12485 2 3 4 3 4
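I want to filter out the duplicate submissions and count only one of them. Some people submitted 3 or 4 times.
I know how to find out how many times a survey was submitted and by whom: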
SELECT
submitter
,issue
,COUNT(*) as '# of times Survey submitted'
FROM
Survey
GROUP BY
submitter, issue
HAVING
COUNT(*) > 1
However, I'm not sure how to use this query to filter out the multiple submissions.
The query I'm currently working on is:
SELECT 'Question #1' as 'Survey Question'
,CAST(CAST(SUM(q1) AS float)/COUNT(q1) AS decimal (4,2)) as 'Average Score'
FROM Survey
WHERE COALESCE(q1,q2,q3,q4,q5) IS NOT NULL
UNION ALL
SELECT 'Question #2' as 'Survey Question'
,CAST(CAST(SUM(q2) AS float)/COUNT(q2) AS decimal (4,2)) as 'Average Score'
FROM Survey
WHERE COALESCE(q1,q2,q3,q4,q5) IS NOT NULL
UNION ALL
etc...
The desired result is: (Note: this result set is not accurate. It only shows the format I want.)
Survey Question Average Score
Question #1 4.58
Question #2 4.80
Question #3 4.60
Question #4 4.59
Question #5 4.64
Can anyone offer a pointer?
Thanks very much!
Answer 0 (score: 2)
I believe my math is right, but my results don't quite match what you asked for. Are you sure your desired results are correct?
DECLARE @yourTable TABLE (submitter VARCHAR(10), Issue INT, q1 TINYINT, q2 TINYINT,q3 TINYINT, q4 TINYINT,q5 TINYINT);
INSERT INTO @yourTable
VALUES ('mike',11557,4,3,4,5,1),
('mark',13554,5,5,5,5,5),
('luke',15110,1,1,1,1,1),
('luke',15110,1,1,1,1,1),
('donald',16900,4,2,2,4,5),
('joe',11562,5,5,5,5,5),
('joe',11562,5,5,5,5,5),
('sam',12485,2,3,4,3,4),
('sam',12485,2,3,4,3,4),
('sam',12485,2,3,4,3,4);
WITH CTE_Distinct
AS
(
SELECT DISTINCT *
FROM @yourTable --just change this to your actual table name.
)
SELECT REPLACE(question,'q','Question #') AS [Survey Question],
CAST(AVG(val * 1.0) AS DECIMAL(4,2)) AS [Average Score]
FROM CTE_Distinct
UNPIVOT
(
val FOR question IN (q1,q2,q3,q4,q5)
) unpvt
GROUP BY question
Results:
Survey Question Average Score
-------------------- ---------------------------------------
Question #1 3.50
Question #2 3.17
Question #3 3.50
Question #4 3.83
Question #5 3.50
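If you would rather keep the UNION ALL layout from the question, the same de-duplication can be applied there. The following is only a sketch under a few assumptions: the table variable @Survey is a stand-in for the real Survey table, loaded with the sample rows from the question, and the CTE name Deduped is made up for illustration.

```sql
-- Stand-in for the real Survey table (assumption), holding the question's sample rows.
DECLARE @Survey TABLE (submitter VARCHAR(10), issue INT,
                       q1 TINYINT, q2 TINYINT, q3 TINYINT, q4 TINYINT, q5 TINYINT);
INSERT INTO @Survey
VALUES ('mike',   11557, 4, 3, 4, 5, 1),
       ('mark',   13554, 5, 5, 5, 5, 5),
       ('luke',   15110, 1, 1, 1, 1, 1),
       ('luke',   15110, 1, 1, 1, 1, 1),
       ('donald', 16900, 4, 2, 2, 4, 5),
       ('joe',    11562, 5, 5, 5, 5, 5),
       ('joe',    11562, 5, 5, 5, 5, 5),
       ('sam',    12485, 2, 3, 4, 3, 4),
       ('sam',    12485, 2, 3, 4, 3, 4),
       ('sam',    12485, 2, 3, 4, 3, 4);

-- Deduped (hypothetical name) keeps one copy of each fully identical submission.
WITH Deduped AS
(
    SELECT DISTINCT submitter, issue, q1, q2, q3, q4, q5
    FROM @Survey
    WHERE COALESCE(q1, q2, q3, q4, q5) IS NOT NULL  -- same filter as in the question
)
SELECT 'Question #1' AS [Survey Question],
       CAST(AVG(q1 * 1.0) AS DECIMAL(4,2)) AS [Average Score]
FROM Deduped
UNION ALL
SELECT 'Question #2', CAST(AVG(q2 * 1.0) AS DECIMAL(4,2)) FROM Deduped
UNION ALL
SELECT 'Question #3', CAST(AVG(q3 * 1.0) AS DECIMAL(4,2)) FROM Deduped
UNION ALL
SELECT 'Question #4', CAST(AVG(q4 * 1.0) AS DECIMAL(4,2)) FROM Deduped
UNION ALL
SELECT 'Question #5', CAST(AVG(q5 * 1.0) AS DECIMAL(4,2)) FROM Deduped
ORDER BY [Survey Question];
```

Keep in mind that DISTINCT only collapses rows that are identical in every column. If someone resubmits the same issue with different answers, both rows survive and both are counted.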
Answer 1 (score: 1)
WITH TestData AS (
SELECT *
FROM (VALUES
('Mike', 11557, 4, 3, 4, 5, 1)
, ('Mark', 13554, 5, 5, 5, 5, 5)
, ('Luke', 15110, 1, 1, 1, 1, 1)
, ('Luke', 15110, 1, 1, 1, 1, 1)
, ('Donald', 16900, 4, 2, 2, 4, 5)
, ('Joe', 11562, 5, 5, 5, 5, 5)
, ('Joe', 11562, 5, 5, 5, 5, 5)
, ('Sam', 12485, 2, 3, 4, 3, 4)
, ('Sam', 12485, 2, 3, 4, 3, 4)
, ('Sam', 12485, 2, 3, 4, 3, 4)
) A (Submitter, Issue, Q1, Q2, Q3, Q4, Q5)
)
SELECT SurveyQuestion
, AverageScore = AVG(QuestionAnswer * 1.) -- Change the math here if this isn't what you want
FROM (
SELECT A.Submitter
, A.Issue
, B.SurveyQuestion
, B.QuestionAnswer
, RowNum = ROW_NUMBER() OVER(PARTITION BY A.Submitter, A.Issue, B.SurveyQuestion ORDER BY (SELECT NULL)) -- Replace ORDER BY (SELECT NULL) with something more meaningful if you can
FROM TestData A
CROSS APPLY(VALUES -- Unpivot
('Question #1', A.Q1)
, ('Question #2', A.Q2)
, ('Question #3', A.Q3)
, ('Question #4', A.Q4)
, ('Question #5', A.Q5)
) B (SurveyQuestion, QuestionAnswer)
WHERE B.SurveyQuestion IS NOT NULL
) A
WHERE RowNum = 1
GROUP BY SurveyQuestion;
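The comment about replacing ORDER BY (SELECT NULL) matters when a resubmission can carry different answers, because then the choice of which row gets RowNum = 1 changes the result. Below is a rough sketch of one way to do that. It assumes a SubmittedAt column, which is hypothetical and not part of the original table; any column that orders a person's submissions would do. The table variable @Survey, its three sample rows and the CTE name Ranked are likewise made up for illustration.

```sql
-- Hypothetical data: SubmittedAt does NOT exist in the original table.
DECLARE @Survey TABLE (Submitter VARCHAR(10), Issue INT, SubmittedAt DATETIME,
                       Q1 TINYINT, Q2 TINYINT, Q3 TINYINT, Q4 TINYINT, Q5 TINYINT);
INSERT INTO @Survey
VALUES ('sam',  12485, '20200101', 2, 3, 4, 3, 4),
       ('sam',  12485, '20200102', 3, 3, 4, 3, 5),  -- later resubmission, different answers
       ('mike', 11557, '20200101', 4, 3, 4, 5, 1);

-- Rank whole submissions first, newest first, then unpivot only the kept rows.
WITH Ranked AS
(
    SELECT *
         , RowNum = ROW_NUMBER() OVER (PARTITION BY Submitter, Issue
                                       ORDER BY SubmittedAt DESC)
    FROM @Survey
)
SELECT B.SurveyQuestion
     , AverageScore = CAST(AVG(B.QuestionAnswer * 1.0) AS DECIMAL(4,2))
FROM Ranked R
CROSS APPLY (VALUES -- Unpivot
      ('Question #1', R.Q1)
    , ('Question #2', R.Q2)
    , ('Question #3', R.Q3)
    , ('Question #4', R.Q4)
    , ('Question #5', R.Q5)
    ) B (SurveyQuestion, QuestionAnswer)
WHERE R.RowNum = 1          -- keep only the latest submission per submitter and issue
GROUP BY B.SurveyQuestion
ORDER BY B.SurveyQuestion;
```

Numbering the rows before unpivoting means each submission is kept or dropped as a whole, so the five answers that feed the averages always come from the same submission.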
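Answer 2 (score: 0)
I think the first solution you could apply is to select the submitter and issue together with the maximum value of each answer given by each submitter, grouped by submitter and issue.
The problem with this solution is that it returns the largest answer for each question, which may not be the desired output.
Another approach is to add an id to each row. How to tag each row with a distinct id is a different kettle of fish, but once the id is there the select is much simpler:
select *
from survey
where (submitter, issue, id ) in
(
select submitter, issue, max(id)
from survey
group by submitter, issue);
The inner select (the one with the GROUP BY) identifies the ids you want, and the outer select retrieves all the information: submitter, id and answers. You can use max() to treat the last submission as the good one, or min() to take the first.
UPDATE
Sorry, I had missed the "average" part of your request. If you want averages rather than the answers themselves, I humbly recommend the second approach. The select then becomes:
select avg(q1) as avg_q1,
avg(q2) as avg_q2,
....
from survey
where (submitter, issue, id ) in
(
select submitter, issue, max(id)
from survey
group by submitter, issue);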
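SQL Server does not accept the row-value form (submitter, issue, id) IN (subquery), so there the id-based approach needs a join or EXISTS instead. A minimal sketch, assuming an id column that the original table does not have (an IDENTITY column is one way to get it); the table variable @Survey and the alias latest are also just illustrative:

```sql
-- Hypothetical: id is NOT in the original table; IDENTITY numbers rows in insert order.
DECLARE @Survey TABLE (id INT IDENTITY(1,1), submitter VARCHAR(10), issue INT,
                       q1 TINYINT, q2 TINYINT, q3 TINYINT, q4 TINYINT, q5 TINYINT);
INSERT INTO @Survey (submitter, issue, q1, q2, q3, q4, q5)
VALUES ('mike',   11557, 4, 3, 4, 5, 1),
       ('mark',   13554, 5, 5, 5, 5, 5),
       ('luke',   15110, 1, 1, 1, 1, 1),
       ('luke',   15110, 1, 1, 1, 1, 1),
       ('donald', 16900, 4, 2, 2, 4, 5),
       ('joe',    11562, 5, 5, 5, 5, 5),
       ('joe',    11562, 5, 5, 5, 5, 5),
       ('sam',    12485, 2, 3, 4, 3, 4),
       ('sam',    12485, 2, 3, 4, 3, 4),
       ('sam',    12485, 2, 3, 4, 3, 4);

-- The derived table picks the highest id per (submitter, issue);
-- because id is unique, joining on it alone keeps exactly those rows.
SELECT CAST(AVG(s.q1 * 1.0) AS DECIMAL(4,2)) AS avg_q1,
       CAST(AVG(s.q2 * 1.0) AS DECIMAL(4,2)) AS avg_q2,
       CAST(AVG(s.q3 * 1.0) AS DECIMAL(4,2)) AS avg_q3,
       CAST(AVG(s.q4 * 1.0) AS DECIMAL(4,2)) AS avg_q4,
       CAST(AVG(s.q5 * 1.0) AS DECIMAL(4,2)) AS avg_q5
FROM @Survey AS s
JOIN (SELECT submitter, issue, MAX(id) AS last_id
      FROM @Survey
      GROUP BY submitter, issue) AS latest
  ON latest.last_id = s.id;
```

Multiplying by 1.0 is there because AVG over integer columns returns an integer in SQL Server, which would truncate the averages. Swap MAX(id) for MIN(id) to keep each person's first submission instead of the last.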