Best way to perform statistical analysis in MySQL

Time: 2013-10-08 12:32:44

Tags: mysql algorithm

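I have a table with 320 columns. Each column can contain one of five letters (a, b, c, d, e); it is a multiple-choice test. I now want to run a statistical analysis, on the assumption that if 9 out of 10 people answer 'b' to a question, 'b' is probably the correct answer.

What is the most efficient way to do this? I have considered views with ORDER BY and LIMIT, but would that be efficient for 320 columns?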

2 answers:

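Answer 0: (score: 0)

You will need to do the math on each column separately. SQL is built for counting rows.

To get the number of times each answer was given for each question: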

-- your_table is a placeholder for the real table name (TABLE itself is a reserved word in MySQL)
select * from
(select 'a' as answer union select 'b' union select 'c' union select 'd' union select 'e') answers

left join
(select answer_to_q1 as answer, count(*) as q1 from your_table group by 1) q1 on q1.answer=answers.answer

left join
(select answer_to_q2 as answer, count(*) as q2 from your_table group by 1) q2 on q2.answer=answers.answer

... repeat for all columns
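Typing out one join per column gets tedious at 320 columns, so the query text can be generated instead. The following is a minimal sketch of that idea, not a definitive implementation. It assumes a hypothetical table `responses` with answer columns `q1` to `q3`, and uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server.

```python
import sqlite3

# Hypothetical names: table `responses`, answer columns q1..q3 (320 in the question).
columns = ["q1", "q2", "q3"]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(f"CREATE TABLE responses ({', '.join(c + ' TEXT' for c in columns)})")
conn.executemany(
    "INSERT INTO responses VALUES (?, ?, ?)",
    [("b", "a", "c"), ("b", "a", "d"), ("b", "e", "c"), ("a", "a", "c")],
)

# Derived table listing the five possible answers, one row per letter.
select_list = ", ".join(f"{c}_counts.n AS {c}" for c in columns)
letters = " UNION ".join(f"SELECT '{x}'" for x in "bcde")
sql = (
    f"SELECT answers.answer, {select_list} "
    f"FROM (SELECT 'a' AS answer UNION {letters}) answers"
)

# One LEFT JOIN per answer column, generated instead of typed by hand.
for c in columns:
    sql += (
        f" LEFT JOIN (SELECT {c} AS answer, COUNT(*) AS n"
        f" FROM responses GROUP BY {c}) {c}_counts"
        f" ON {c}_counts.answer = answers.answer"
    )
sql += " ORDER BY answers.answer"

counts = conn.execute(sql).fetchall()
for row in counts:
    print(row)  # (answer, count for q1, count for q2, count for q3); None = never chosen
```

Note that MySQL documents a limit of 61 tables per join, so with 320 columns the generated joins would most likely have to be split into several batches of columns.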

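To get the answer with the highest count for each question, where q1, q2, etc. are the columns holding the answers to question 1, question 2, and so on: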

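-- your_table is a placeholder for the real table name
-- each select is parenthesized so that its own order by / limit applies before the union
(select 1 as question, q1 as answer from your_table group by q1 order by count(*) desc limit 1)
union all
(select 2, q2 from your_table group by q2 order by count(*) desc limit 1)
....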
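The same "most frequent answer per column" query can also be generated rather than written out 320 times. This is a sketch under assumptions: the table name `responses` and the columns `q1` to `q3` are hypothetical, and sqlite3 stands in for MySQL. Each branch is wrapped in a derived table so that its own ORDER BY and LIMIT apply before the UNION ALL, a form that SQLite accepts and that should carry over to MySQL.

```python
import sqlite3

columns = ["q1", "q2", "q3"]  # hypothetical column names; 320 in the question

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute("CREATE TABLE responses (q1 TEXT, q2 TEXT, q3 TEXT)")
conn.executemany(
    "INSERT INTO responses VALUES (?, ?, ?)",
    [("b", "a", "c"), ("b", "a", "d"), ("b", "e", "c"), ("a", "a", "c")],
)

# One branch per column: group by that column and keep the most frequent value.
branches = [
    f"SELECT * FROM (SELECT {i} AS question, {c} AS answer FROM responses"
    f" GROUP BY {c} ORDER BY COUNT(*) DESC LIMIT 1) t{i}"
    for i, c in enumerate(columns, start=1)
]
sql = " UNION ALL ".join(branches)

# Map question number -> most frequent answer.
likely_key = dict(conn.execute(sql).fetchall())
print(likely_key)
```

When two answers tie for first place, LIMIT 1 picks one of them arbitrarily, so tied questions are worth checking by hand.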

Answer 1: (score: 0)

Your schema is far from optimal.

If you normalize your data structure, you will find this much easier. Create a table called answer:

create table answer (
    questionnaire_id int, -- an id number for which questionnaire this is 
    question_id int,      -- the id number of the question
    answer enum('a','b','c','d','e') -- the value of the answer 
);
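Existing data in the wide table can be copied into this layout with one INSERT ... SELECT per question column. The sketch below is only an illustration under assumptions: the wide table `responses`, its `id` column, and the columns `q1` to `q3` are hypothetical names, and sqlite3 stands in for MySQL, so the enum is approximated with a CHECK constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here

# Hypothetical wide table: one row per questionnaire, one column per question.
conn.execute("CREATE TABLE responses (id INTEGER PRIMARY KEY, q1 TEXT, q2 TEXT, q3 TEXT)")
conn.executemany(
    "INSERT INTO responses (q1, q2, q3) VALUES (?, ?, ?)",
    [("b", "a", "c"), ("b", "a", None), ("a", "e", "c")],  # None = question left blank
)

# Normalized table; SQLite has no enum, so a CHECK constraint plays that role.
conn.execute(
    """CREATE TABLE answer (
           questionnaire_id INTEGER,
           question_id INTEGER,
           answer TEXT CHECK (answer IN ('a', 'b', 'c', 'd', 'e'))
       )"""
)

# Copy each question column into rows of the normalized table, skipping blanks.
for question_id in range(1, 4):
    col = f"q{question_id}"
    conn.execute(
        f"INSERT INTO answer (questionnaire_id, question_id, answer)"
        f" SELECT id, ?, {col} FROM responses WHERE {col} IS NOT NULL",
        (question_id,),
    )

total = conn.execute("SELECT COUNT(*) FROM answer").fetchone()[0]
print(total)  # number of answers copied
```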

You can then look at the distribution of answers for each question with a query like this:

select question_id, answer,count(*)
from answer
group by question_id, answer; -- just one example of how to look at the answers
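From that distribution, the "9 out of 10" rule in the question can be applied by taking the most frequent answer per question and checking its share of all answers. The following is a rough sketch, assuming a 0.9 cutoff (a hypothetical threshold taken from the question's example) and using sqlite3 as a stand-in for MySQL; the final step is done in Python rather than in SQL.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE answer (questionnaire_id INTEGER, question_id INTEGER, answer TEXT)"
)

# Sample data: question 1 has 9 of 10 answering 'b'; question 2 is split 5/5.
rows = [(r, 1, "b") for r in range(1, 10)] + [(10, 1, "a")]
rows += [(r, 2, "c") for r in range(1, 6)] + [(r, 2, "d") for r in range(6, 11)]
conn.executemany("INSERT INTO answer VALUES (?, ?, ?)", rows)

# Same grouping as the query above: one count per (question, answer) pair.
distribution = defaultdict(dict)
for question_id, ans, n in conn.execute(
    "SELECT question_id, answer, COUNT(*) FROM answer GROUP BY question_id, answer"
):
    distribution[question_id][ans] = n

THRESHOLD = 0.9  # assumed cutoff, based on the "9 out of 10" example

# Keep a question only when its most frequent answer reaches the cutoff.
likely_key = {}
for question_id, counts in distribution.items():
    top_answer, top_count = max(counts.items(), key=lambda kv: kv[1])
    share = top_count / sum(counts.values())
    if share >= THRESHOLD:
        likely_key[question_id] = (top_answer, share)

print(likely_key)
```

Questions that do not reach the cutoff, such as the evenly split question 2 here, are left out of `likely_key` rather than guessed.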