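Query on timestamp to select only the most recent rows for each group

Time: 2013-09-20 01:39:33

Tags: mysql sql

I have a table that records users' answers to a number of questions:

tableA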

user_id  | question_id | date answered | correct?
-------------------------------------------------
   66         345          timestamp        1
   34         654          timestamp        0
   34         654          timestamp        1

Every attempt a user makes at a question is stored in the database.

I also have a list of categories and the question_ids in each category, e.g.

tableB

category_id    |    question_id
--------------------------------
   1                     34
   1                     44
   1                     23
   2                     99 
   2                     44

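I am trying to write a query that calculates, per category, the percentage of questions the user has previously answered correctly (correct? = 1), and also the percentage answered correctly among the last 20 questions answered in that category.

So far I can do the first part, but not the second: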

-- SUM adds up the 0/1 flag; COUNT would count every non-NULL row instead
SELECT category_id, COUNT(*), SUM(correct)
FROM tableA LEFT JOIN tableB USING (question_id)
WHERE user_id = 1
GROUP BY category_id

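This gives me the total number of questions in the category and the number of questions the user has answered correctly in the category. Something like this: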

cat_id  | total_questions | answered_correctly
-------------------------------------------------
 1           455               323
 2           334               123

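However, for each category I also want to look at the last 20 questions answered in that category and retrieve how many of them were correct. So I want something like this: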

cat_id | total_questions | answered_correctly   | questions_correct_in_last_20_answered
-------------------------------------------------------------------------------------
 1           455                323                            12
 2           334                123                            8

3 answers:

Answer 0 (score: 1)

Hey friend, take a look at this:

  

Foo.c SELECT,COUNT(*)AS pct * t.factor FROM foo JOIN(SELECT   100 / COUNT(*)FROM foo AS factor)AS t GROUP BY foo.c;

There! A JOIN that fetches the overall total, plus a bit of arithmetic, is enough. Going back to my real case (a metaphor), we have:

SELECT COUNT(id) * t.factor AS pct, good_person FROM people
  JOIN (SELECT 100 / COUNT(*) AS factor FROM people) AS t
  GROUP BY good_person, t.factor;

Original link (in Portuguese) here: MySQL Blog

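Answer 1 (score: 1)

To add the last 20 answered questions, you need to select the last 20 rows and then count the correct answers. However, GROUP BY and LIMIT do not combine well, and you cannot pick out the last 20 rows unless you check only one category at a time. MySQL also does not let you join to a subquery when that subquery refers to one of the tables being joined.

So the following query is a workaround: it takes all of the answers in the category ordered by timestamp, builds a list from them, keeps the first 20, and then counts the correct answers. Tricky, but it gets the job done.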

SELECT category_id,
       Total_Q_Tried,
       Total_Unique_Q_Tried,
       Total_Answered_Correctly,
       Total_Answered_Correctly / Total_Q_Tried*100 Total_Correct_Answer_Percentage,
       Total_Answered_Correctly_In_Last20,
       Total_Answered_Correctly_In_Last20 / LEAST(20,Total_Q_Tried)*100 Total_Correct_Answer_Last20_Percentage
FROM (
   SELECT
     B.category_id, COUNT(B.question_id) Total_Q_Tried, 
     COUNT(DISTINCT B.question_id) Total_Unique_Q_Tried,
     SUM(A.correct) Total_Answered_Correctly,

     (SELECT length(SUBSTRING_INDEX(GROUP_CONCAT(AA.correct ORDER BY AA.date_answered DESC SEPARATOR ',' ), ',', 20))
           - length(replace(SUBSTRING_INDEX(GROUP_CONCAT(AA.correct ORDER BY AA.date_answered DESC SEPARATOR ',' ), ',', 20),'1', ''))
      FROM tableA AA INNER JOIN tableB BB ON AA.question_id = BB.question_id
      WHERE BB.category_id = B.category_id
           AND AA.user_id = A.user_id
     ) Total_Answered_Correctly_In_Last20

    FROM tableA A LEFT JOIN tableB B
          ON B.question_id = A.question_id
    WHERE A.user_id = 34
    GROUP BY B.category_id ) FinalNumbers
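
The counting trick inside the correlated subquery may be easier to follow on a literal string. This is only a minimal sketch: the user variable @answers is a hypothetical stand-in for the GROUP_CONCAT result, and it keeps 3 values instead of 20 to stay short.

```sql
-- @answers is a hypothetical stand-in for
-- GROUP_CONCAT(AA.correct ORDER BY AA.date_answered DESC SEPARATOR ','),
-- i.e. the 0/1 flags with the newest answer first.
SET @answers = '1,0,1,1,0';

SELECT
    -- Keep everything before the 3rd comma: the 3 newest answers, '1,0,1'.
    SUBSTRING_INDEX(@answers, ',', 3) AS newest_three,
    -- Removing every '1' shortens the string by one character per correct
    -- answer, so the length difference is the number of correct answers
    -- (here 5 - 3 = 2).
    LENGTH(SUBSTRING_INDEX(@answers, ',', 3))
      - LENGTH(REPLACE(SUBSTRING_INDEX(@answers, ',', 3), '1', '')) AS correct_in_newest_three;
```

This only works because correct holds nothing but 0 or 1. GROUP_CONCAT output is also cut off at group_concat_max_len (1024 by default), but the cut happens at the end of the string, so the first 20 single-digit values should survive.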

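If you want the percentage of correct answers among the last twenty, you need to divide TOTAL_ANSWERED_CORRECTLY_IN_LAST20 by the smaller of 20 and TOTAL_Q_TRIED, both of which are computed in the query.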
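
As a worked example of that LEAST guard, here is the same arithmetic on literals rather than on the tables. The numbers are borrowed from the question's desired output and from category 2 of the result table in this answer; they are not new data.

```sql
-- More than 20 attempts: the divisor is capped at 20.
-- 12 correct among the last 20 of 455 attempts is 60 percent.
SELECT 12 / LEAST(20, 455) * 100 AS pct_last20_many_attempts;

-- Fewer than 20 attempts: the divisor is the attempt count itself.
-- 2 correct out of 3 attempts; the result table reports 66.6667 for this case.
SELECT 2 / LEAST(20, 3) * 100 AS pct_last20_few_attempts;
```

Without LEAST, a user with only 3 attempts would be scored out of 20 and could never exceed 15 percent.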

-

I couldn't try it on a large table, but performance may not be great if there are a lot of rows.

| USER_ID | QUESTION_ID |                  DATE_ANSWERED | CORRECT |
|---------|-------------|--------------------------------|---------|
|      66 |           1 | January, 01 2013 00:00:00+0000 |       1 |
|      34 |           1 | January, 02 2013 00:00:00+0000 |       1 |
|      34 |           2 | January, 03 2013 00:00:00+0000 |       1 |
|      34 |           3 | January, 04 2013 00:00:00+0000 |       0 |
|      34 |           4 | January, 05 2013 00:00:00+0000 |       1 |
|      34 |           6 | January, 06 2013 00:00:00+0000 |       0 |


| CATEGORY_ID | QUESTION_ID |
|-------------|-------------|
|           1 |           1 |
|           2 |           2 |
|           2 |           3 |
|           2 |           4 |
|           2 |           5 |
|           3 |           6 |


| CATEGORY_ID | TOTAL_Q_TRIED | TOTAL_UNIQUE_Q_TRIED | TOTAL_ANSWERED_CORRECTLY | TOTAL_CORRECT_ANSWER_PERCENTAGE | TOTAL_ANSWERED_CORRECTLY_IN_LAST20 | TOTAL_CORRECT_ANSWER_LAST20_PERCENTAGE |
|-------------|---------------|----------------------|--------------------------|---------------------------------|------------------------------------|----------------------------------------|
|           1 |             1 |                    1 |                        1 |                             100 |                                  1 |                                    100 |
|           2 |             3 |                    3 |                        2 |                         66.6667 |                                  2 |                                66.6667 |
|           3 |             1 |                    1 |                        0 |                               0 |                                  0 |                                      0 |
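
On MySQL 8.0 or later, window functions offer another route to the same counts. The following is only a sketch under that version assumption (window functions were not available in MySQL when this answer was written), and the names ranked and rn are made up for illustration.

```sql
-- Assumes MySQL 8.0+ (ROW_NUMBER is not available in earlier versions).
SELECT category_id,
       COUNT(*)     AS Total_Q_Tried,
       SUM(correct) AS Total_Answered_Correctly,
       -- Only rows ranked 1..20 (the newest 20 per category) contribute here.
       SUM(CASE WHEN rn <= 20 THEN correct ELSE 0 END) AS Total_Answered_Correctly_In_Last20
FROM (
    SELECT B.category_id,
           A.correct,
           -- Number this user's answers within each category, newest first.
           ROW_NUMBER() OVER (PARTITION BY B.category_id
                              ORDER BY A.date_answered DESC) AS rn
    FROM tableA A
    INNER JOIN tableB B ON B.question_id = A.question_id
    WHERE A.user_id = 34
) ranked
GROUP BY category_id;
```

Unlike the query above, this sketch uses an INNER JOIN, so answers to questions that belong to no category are dropped rather than grouped under NULL. On the sample rows above it should give the same three counts per category as the result table. If two answers share a timestamp, which one ranks first is not defined.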

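Following the comments below: adding the total number of unique questions answered correctly.

This is getting harder and harder. In the newly added subquery I am joining on every column, including the timestamp, to get the unique answers. See below.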

SELECT category_id,
       Total_Q_Tried,
       Total_Unique_Q_Tried,
       Total_Answered_Correctly,
       Total_Unique_Answered_Correctly,
       Total_Answered_Correctly / Total_Q_Tried*100 Total_Correct_Answer_Percentage,
       Total_Answered_Correctly_In_Last20,
       Total_Answered_Correctly_In_Last20 / LEAST(20,Total_Q_Tried)*100 Total_Correct_Answer_Last20_Percentage
FROM (
  SELECT
     B.category_id, COUNT(B.question_id) Total_Q_Tried, 
     COUNT(DISTINCT B.question_id) Total_Unique_Q_Tried,
     SUM(A.correct) Total_Answered_Correctly,
     SUM(UniqueA.correct) Total_Unique_Answered_Correctly,


     (SELECT length(SUBSTRING_INDEX(GROUP_CONCAT(AA.correct ORDER BY AA.date_answered DESC SEPARATOR ',' ), ',', 20))
           - length(replace(SUBSTRING_INDEX(GROUP_CONCAT(AA.correct ORDER BY AA.date_answered DESC SEPARATOR ',' ), ',', 20),'1', ''))
      FROM tableA AA INNER JOIN tableB BB ON AA.question_id = BB.question_id
      WHERE BB.category_id = B.category_id
           AND AA.user_id = A.user_id
     ) Total_Answered_Correctly_In_Last20

  FROM tableA A LEFT JOIN tableB B
        ON B.question_id = A.question_id
       LEFT JOIN (select user_id, question_id,  MAX(date_answered) date_answered, correct
                    from tableA
                    GROUP BY user_id, question_id, correct
                   ) UniqueA
        ON A.user_id = UniqueA.user_id AND A.question_id = UniqueA.question_id AND  A.date_answered = UniqueA.date_answered
  WHERE A.user_id = 34
  GROUP BY B.category_id ) FinalNumbers;

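This may not work for the percentage of the last 20 questions answered correctly. Please test it. If it does not, replace tableA A and tableA AA with the SELECT query used for UniqueA, so that only unique answers are processed, and remove the newly added LEFT JOIN.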
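
If "unique questions answered correctly" is taken to mean questions with at least one correct attempt by the user (an assumption; the comments that prompted this are not shown here), a conditional COUNT(DISTINCT ...) may be a simpler way to get that one number. A sketch, not a drop-in replacement for the full query:

```sql
SELECT B.category_id,
       COUNT(DISTINCT A.question_id) AS Total_Unique_Q_Tried,
       -- CASE yields the question_id only for correct attempts and NULL otherwise;
       -- COUNT(DISTINCT ...) ignores NULLs, so each question counts at most once.
       COUNT(DISTINCT CASE WHEN A.correct = 1 THEN A.question_id END)
           AS Total_Unique_Answered_Correctly
FROM tableA A
LEFT JOIN tableB B ON B.question_id = A.question_id
WHERE A.user_id = 34
GROUP BY B.category_id;
```

This avoids the extra self-join on date_answered, at the cost of not distinguishing whether the most recent attempt at a question was correct.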

Answer 2 (score: 0)

You need to add a subquery that returns the rows ordered by timestamp, with LIMIT 20.
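
A minimal sketch of what such a subquery could look like for a single category. The values user_id = 34 and category_id = 2 are example values borrowed from the sample data in Answer 1, and the alias last20 is made up for illustration.

```sql
SELECT SUM(correct) AS questions_correct_in_last_20_answered
FROM (
    -- The 20 most recent answers by this user in this one category.
    SELECT A.correct
    FROM tableA A
    INNER JOIN tableB B ON B.question_id = A.question_id
    WHERE A.user_id = 34
      AND B.category_id = 2
    ORDER BY A.date_answered DESC
    LIMIT 20
) AS last20;
```

This handles one category per execution, which is the limitation Answer 1 describes, and it returns NULL rather than 0 when the user has no answers in the category.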