Calculating pages read per day with a SELECT query on a reading progress table

Date: 2019-07-11 09:31:08

Tags: mysql sql database

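I have a small app that tracks my progress in the books I read, similar to Goodreads, so that I know how much I read each day.
For this I created two tables: tbl_materials(material_id int, name varchar) and tbl_progress(date_of_update timestamp, material_id int foreign key, read_pages int, skipped bit).
Whenever I read some pages, I insert the page I have currently reached into tbl_progress.
I may read from the same book several times a day. If I skip some pages, I also insert them into tbl_progress and mark the skipped bit as true. The problem is that I can't query tbl_progress to find out how many pages I read each day.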

What I am trying to do is find the most recently inserted progress for each material on each day. For example:

+-------------+------------+---------+---------------------+
| material_id | read_pages | skipped | last_update         |
+-------------+------------+---------+---------------------+
|           4 |          1 |         | 2017-09-22 00:56:02 |
|           3 |          1 |         | 2017-09-22 00:56:14 |
|          12 |          1 |         | 2017-09-24 20:13:01 |
|           4 |         30 |         | 2017-09-25 01:56:38 |
|           4 |         34 |         | 2017-09-25 02:19:47 |
|          54 |          1 |         | 2017-09-29 04:22:11 |
|          59 |          9 |         | 2017-10-14 15:25:14 |
|           4 |         68 | T       | 2017-10-18 02:33:04 |
|           4 |         72 |         | 2017-10-18 03:50:51 |
|           2 |          3 |         | 2017-10-18 15:02:46 |
|           2 |          5 |         | 2017-10-18 15:10:46 |
|           4 |         82 |         | 2017-10-18 16:18:03 |
|           4 |         84 |         | 2017-10-20 18:06:40 |
|           4 |         87 |         | 2017-10-20 19:11:07 |
|           4 |        103 | T       | 2017-10-21 19:50:29 |
|           4 |        104 |         | 2017-10-22 19:56:14 |
|           4 |        108 |         | 2017-10-22 20:08:08 |
|           2 |          6 |         | 2017-10-23 00:35:45 |
|           4 |        111 |         | 2017-10-23 02:29:32 |
|           4 |        115 |         | 2017-10-23 03:06:15 |
+-------------+------------+---------+---------------------+

I compute the total pages read on a day as the last page reached on that day minus the last page reached on any earlier date. This works, but the problem is that I can't exclude the skipped pages.
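On 2017-09-22 I read one page (material_id = 4) and then another page (material_id = 3), so the total for that day is 2.
On 2017-09-25 the latest update for material_id 4 is page 34, which means I read 34 - 1 = 33 pages (the latest update on that day, 34, minus the latest update before that date, 1).
So far everything works, but I can't handle skipped pages. For example:
On 2017-10-18 the last page read for material_id = 4 was 34 (on 2017-09-25). I then skipped 34 pages, so the current page became 68. After that I read 4 pages (2017-10-18 03:50:51) and then another 10 pages (2017-10-18 16:18:03), so the total for material_id = 4 is 14.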
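The rule described here can be sketched outside the database: every update contributes the difference to the previous update of the same material, and the differences that belong to skipped rows are dropped. The snippet below is only an illustration in Python. The function name `daily_reads` and the tuple layout are assumptions made for this example and are not part of the original schema.

```python
from collections import defaultdict

# (material_id, read_pages, skipped, last_update)
# The material_id = 4 rows up to 2017-10-18 from the sample data.
rows = [
    (4, 1,  False, "2017-09-22 00:56:02"),
    (4, 30, False, "2017-09-25 01:56:38"),
    (4, 34, False, "2017-09-25 02:19:47"),
    (4, 68, True,  "2017-10-18 02:33:04"),
    (4, 72, False, "2017-10-18 03:50:51"),
    (4, 82, False, "2017-10-18 16:18:03"),
]


def daily_reads(rows):
    """Sum per-update page differences per day, ignoring skipped updates."""
    last_page = {}             # material_id -> page reached by the previous update
    totals = defaultdict(int)  # "YYYY-MM-DD" -> pages actually read
    for material_id, page, skipped, ts in sorted(rows, key=lambda r: r[3]):
        diff = page - last_page.get(material_id, 0)
        last_page[material_id] = page  # a skip still moves the current page
        if not skipped:
            totals[ts[:10]] += diff    # ts[:10] is the date part
    return dict(totals)


print(daily_reads(rows))
```

For these rows the 34 pages skipped on 2017-10-18 are left out, so that day contributes 4 + 10 = 14 pages for material_id = 4.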

I created a view that selects the latest last_update for each book on each day:

create view v_mostRecentPerDay as
select material_id                                                    id,
       (select title from materials where materials.material_id = id) title,
       completed_pieces,
       last_update,
       date(last_update)                                              dl,
       skipped
from progresses
where last_update = (
    select max(last_update)
    from progresses s2
    where s2.material_id = progresses.material_id
      and date(s2.last_update) = date(progresses.last_update)
      and s2.skipped = false
);

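So if a single book has many updates in one day, this view retrieves the most recent one (the max last_update), which also carries the highest page number for that book.

Another view gets the total pages read per day: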

create view v_totalReadInDay as
select dl, sum(diff) totalReadsInThisDay
from (
         select dl,
                completed_pieces - ifnull((select completed_pieces
                                           from progresses
                                           where material_id = id
                                             and date(progresses.last_update) < dl
                                           ORDER BY last_update desc
                                           limit 1
                                          ), 0) diff
         from v_mostRecentPerDay
         where skipped = false
     ) omda
group by dl;

But the problem is that the last view also counts the skipped pages.
Expected result:

+------------+------------------+
| day        | total_read_pages |
+------------+------------------+
| 2017-09-22 | 2                |
+------------+------------------+
| 2017-09-24 | 1                |
+------------+------------------+
| 2017-09-25 | 33               |
+------------+------------------+
| 2017-09-29 | 1                |
+------------+------------------+
| 2017-10-14 | 9                |
+------------+------------------+
| 2017-10-18 | 19               |
+------------+------------------+
| 2017-10-20 | 5                |
+------------+------------------+
| 2017-10-21 | 0                |
+------------+------------------+
| 2017-10-22 | 21               |
+------------+------------------+
| 2017-10-23 | 8                |
+------------+------------------+
mysql> SELECT VERSION();
+-----------------------------+
| VERSION()                   |
+-----------------------------+
| 5.7.26-0ubuntu0.16.04.1-log |
+-----------------------------+

2 Answers:

Answer 0 (score: 1)

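This seems like a very complicated way to work out the pages read per day. Have you considered denormalizing the data and storing both the current page and the number of pages read?

The current page might be stored more sensibly in the materials table or in a separate bookmark table, e.g.

  • bookmark_id, material_id, page_number
  • reading_id, bookmark_id, pages_complete, was_skipped, ended_at

When a reading (or skipping!) session finishes, pages_complete is easily calculated by subtracting the old current page in the bookmark from the new current page. This can be done in your application logic.

Your pages-per-day query then becomes simple: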

SELECT SUM(pages_complete) pages_read
  FROM reading
 WHERE ended_at >= :day
   AND ended_at < :day + INTERVAL 1 DAY
   AND was_skipped IS NOT TRUE
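As a rough sketch of how this design could look in application code, the example below uses Python's built-in sqlite3 module as a stand-in for MySQL. The helpers `record_session` and `pages_read_on`, the table definitions, and the 0/1 encoding of was_skipped are assumptions made for illustration. SQLite's `date(?, '+1 day')` replaces MySQL's `:day + INTERVAL 1 DAY`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE bookmark (
        bookmark_id INTEGER PRIMARY KEY,
        material_id INTEGER UNIQUE,
        page_number INTEGER NOT NULL
    );
    CREATE TABLE reading (
        reading_id     INTEGER PRIMARY KEY,
        bookmark_id    INTEGER REFERENCES bookmark(bookmark_id),
        pages_complete INTEGER NOT NULL,
        was_skipped    INTEGER NOT NULL,  -- 0 = read, 1 = skipped
        ended_at       TEXT NOT NULL
    );
""")


def record_session(material_id, new_page, was_skipped, ended_at):
    """Hypothetical helper: store the page delta, then move the bookmark."""
    row = conn.execute(
        "SELECT bookmark_id, page_number FROM bookmark WHERE material_id = ?",
        (material_id,),
    ).fetchone()
    if row is None:
        # First session for this material: start the bookmark at page 0.
        cur = conn.execute(
            "INSERT INTO bookmark (material_id, page_number) VALUES (?, 0)",
            (material_id,),
        )
        bookmark_id, old_page = cur.lastrowid, 0
    else:
        bookmark_id, old_page = row
    conn.execute(
        "INSERT INTO reading (bookmark_id, pages_complete, was_skipped, ended_at)"
        " VALUES (?, ?, ?, ?)",
        (bookmark_id, new_page - old_page, int(was_skipped), ended_at),
    )
    conn.execute(
        "UPDATE bookmark SET page_number = ? WHERE bookmark_id = ?",
        (new_page, bookmark_id),
    )


def pages_read_on(day):
    """Pages read (not skipped) on the given YYYY-MM-DD day."""
    return conn.execute(
        "SELECT COALESCE(SUM(pages_complete), 0) FROM reading"
        " WHERE ended_at >= ? AND ended_at < date(?, '+1 day')"
        " AND was_skipped = 0",
        (day, day),
    ).fetchone()[0]


record_session(4, 34, False, "2017-09-25 02:19:47")
record_session(4, 68, True,  "2017-10-18 02:33:04")
record_session(4, 72, False, "2017-10-18 03:50:51")
record_session(4, 82, False, "2017-10-18 16:18:03")
print(pages_read_on("2017-10-18"))
```

Because the delta is stored when the session is recorded, the daily query needs no self-join or correlated subquery. The skipped session still advances the bookmark, so the sessions after it are measured from page 68.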

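Answer 1 (score: 0)

You can create a view with the same columns as the progress table plus one more derived column, using the same idea as the pages_complete column that @Arth suggested.
This column contains the current completed pages minus the completed pages of the same material's most recent earlier update, i.e. the difference between the two.
So, for example, if your progress table looks like this: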

+-------------+------------+---------+---------------------+
| material_id | read_pages | skipped | last_update         |
+-------------+------------+---------+---------------------+
|           4 |         68 | T       | 2017-10-18 02:33:04 |
|           4 |         72 |         | 2017-10-18 03:50:51 |
|           2 |          3 |         | 2017-10-18 15:02:46 |
|           2 |          5 |         | 2017-10-18 15:10:46 |
|           4 |         82 |         | 2017-10-18 16:18:03 |
+-------------+------------+---------+---------------------+

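We add another derived column called diff.
For each row, it is the difference between read_pages at that update (e.g. 2017-10-18 02:33:04) and the read_pages of the same material's last update before that time.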

+-------------+------------+---------+---------------------+------------------+
| material_id | read_pages | skipped | last_update         | Derived_col_diff |
+-------------+------------+---------+---------------------+------------------+
| 4           | 68         | T       | 2017-10-18T02:33:04 | 68 - null = 0    |
+-------------+------------+---------+---------------------+------------------+
| 4           | 72         |         | 2017-10-18T03:50:51 | 72 - 68 = 4      |
+-------------+------------+---------+---------------------+------------------+
| 2           | 3          |         | 2017-10-18T15:02:46 | 3 - null = 0     |
+-------------+------------+---------+---------------------+------------------+
| 2           | 5          |         | 2017-10-18T15:10:46 | 5 - 3 = 2        |
+-------------+------------+---------+---------------------+------------------+
| 4           | 82         |         | 2017-10-18T16:18:03 | 82 - 72 = 10     |
+-------------+------------+---------+---------------------+------------------+

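Note: 68 - null is null, but I set it to 0 for clarity.
The derived column here is the difference between this row's read_pages and the read_pages of the row just before it for the same material.
Here is the view: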

create view v_progesses_with_read_pages as
select s0.*,
       completed_pieces - ifnull((select completed_pieces
                                  from progresses s1
                                  where s1.material_id = s0.material_id
                                    and s1.last_update = (
                                      select max(last_update)
                                      FROM progresses s2
                                      where s2.material_id = s1.material_id and s2.last_update < s0.last_update
                                  )), 0) read_pages
from progresses s0;

Then you can select the sum of this derived column for each day:

select date(last_update) dl,
       sum(read_pages)   totalReadsInThisDay
from v_progesses_with_read_pages
where skipped = false
group by dl;

This gives a result like the following:

+-------------+-----------------------------+
| dl          | totalReadsInThisDay         |
+-------------+-----------------------------+
| 2017-10-18  | 16                          |
+-------------+-----------------------------+
| 2017-10-19  | 20 (just for clarification) |
+-------------+-----------------------------+

Note that the last row is made up, just for illustration.
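To check the view against the complete sample data from the question, the sketch below loads all twenty rows into an in-memory SQLite database and runs the same view and daily query. SQLite stands in for MySQL 5.7 here, an assumption made so that the example is self-contained. The only change to the SQL is `skipped = 0` instead of `skipped = false`, because skipped is stored as 0/1 in this sketch.

```python
import sqlite3

# (material_id, completed_pieces, skipped, last_update) from the question
DATA = [
    (4, 1, 0, "2017-09-22 00:56:02"), (3, 1, 0, "2017-09-22 00:56:14"),
    (12, 1, 0, "2017-09-24 20:13:01"), (4, 30, 0, "2017-09-25 01:56:38"),
    (4, 34, 0, "2017-09-25 02:19:47"), (54, 1, 0, "2017-09-29 04:22:11"),
    (59, 9, 0, "2017-10-14 15:25:14"), (4, 68, 1, "2017-10-18 02:33:04"),
    (4, 72, 0, "2017-10-18 03:50:51"), (2, 3, 0, "2017-10-18 15:02:46"),
    (2, 5, 0, "2017-10-18 15:10:46"), (4, 82, 0, "2017-10-18 16:18:03"),
    (4, 84, 0, "2017-10-20 18:06:40"), (4, 87, 0, "2017-10-20 19:11:07"),
    (4, 103, 1, "2017-10-21 19:50:29"), (4, 104, 0, "2017-10-22 19:56:14"),
    (4, 108, 0, "2017-10-22 20:08:08"), (2, 6, 0, "2017-10-23 00:35:45"),
    (4, 111, 0, "2017-10-23 02:29:32"), (4, 115, 0, "2017-10-23 03:06:15"),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE progresses (
        material_id      INTEGER,
        completed_pieces INTEGER,
        skipped          INTEGER,  -- 0 = read, 1 = skipped
        last_update      TEXT
    );

    -- Same logic as the view in the answer: current page minus the page
    -- of the same material's most recent earlier update (0 if none).
    CREATE VIEW v_progesses_with_read_pages AS
    SELECT s0.*,
           completed_pieces - ifnull((
               SELECT completed_pieces
               FROM progresses s1
               WHERE s1.material_id = s0.material_id
                 AND s1.last_update = (
                     SELECT max(last_update)
                     FROM progresses s2
                     WHERE s2.material_id = s1.material_id
                       AND s2.last_update < s0.last_update
                 )), 0) AS read_pages
    FROM progresses s0;
""")
conn.executemany("INSERT INTO progresses VALUES (?, ?, ?, ?)", DATA)

totals = dict(conn.execute("""
    SELECT date(last_update) AS dl, sum(read_pages)
    FROM v_progesses_with_read_pages
    WHERE skipped = 0
    GROUP BY dl
    ORDER BY dl
""").fetchall())

for day, pages in totals.items():
    print(day, pages)
```

On this data the sketch gives 19 for 2017-10-18 and 8 for 2017-10-23, as in the question's expected table. 2017-10-21 has no row at all because its only update is a skip. 2017-10-22 comes to 5 (104 - 103 plus 108 - 104) rather than the 21 listed in the question, which appears to include the 16 pages skipped the day before.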