Question

We have a chat system with an analytics dashboard. Currently, we display the top phrases that people have said. The model looks like this:


messages
    --key1
       -text: "who are you"
    --key2
       -text: "hello"
    --key3
       -text: "who are you"

A database trigger runs whenever a new message is inserted and stores counts like this:


stat
   --topPhrases
     --keyA
        --phrase: "who are you"
        --count: 2
     --key
        --phrase: "hello"
        --count: 1
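
For illustration, the counting that such a trigger performs could look roughly like the sketch below. It is an in-memory stand-in only: `on_message_inserted` is a hypothetical name, a plain dict takes the place of the database, and counts are keyed by the phrase text instead of generated keys such as `keyA`, purely to keep the example short.

```python
def on_message_inserted(top_phrases, text):
    """Increment the all-time counter for the phrase of a newly inserted message."""
    top_phrases[text] = top_phrases.get(text, 0) + 1


# Replay the three messages from the model above.
top_phrases = {}
for text in ["who are you", "hello", "who are you"]:
    on_message_inserted(top_phrases, text)
```

After the replay, `top_phrases` holds a count of 2 for "who are you" and 1 for "hello", matching the `stat` tree.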

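Our dashboard queries this data and displays it as the top phrases used.

The problem we now face is that we need to add a date element to this. As it stands, the data answers "what are the top phrases people have ever said?"

What we want to answer now is "what are the top phrases today, this week, and this month?"

So we probably need to store the stats data model differently. Please advise.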

Answer 1

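The usual recommendation is to store the data that your application needs to display. So if you want to display the top phrases for today, this week, and this month, that means storing exactly those aggregates: the top phrases by day, by week, and by month.

A simple model for storing these is to keep what you have now, and then repeat it for each aggregation level and each time interval: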

stats
   --topPhrases
     --keyA
        --phrase: "who are you"
        --count: 2
     --key
        --phrase: "hello"
        --count: 1
   --topPhrases_byDay
     --20190607
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --20190608
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
   --topPhrases_byWeek
     --201922
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --201923
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
   --topPhrases_byMonth
     --201905
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --201906
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
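
As a rough sketch of what the write side of this model involves, the code below works out which buckets a message dated on a given day belongs to and increments the phrase count in each. Several things here are assumptions and not part of the model above: the function names are hypothetical, a nested dict stands in for the database, counts are keyed by phrase text for brevity, and week keys use ISO week numbering (which happens to place 2019-06-07 in week 201923).

```python
from datetime import date


def bucket_paths(day):
    """Return the path of every aggregate that a message from `day` counts towards."""
    iso_year, iso_week, _ = day.isocalendar()
    return [
        "topPhrases",                                    # all-time
        f"topPhrases_byDay/{day:%Y%m%d}",                # e.g. 20190607
        f"topPhrases_byWeek/{iso_year}{iso_week:02d}",   # e.g. 201923 (ISO week, assumed)
        f"topPhrases_byMonth/{day:%Y%m}",                # e.g. 201906
    ]


def record_phrase(stats, phrase, day):
    """Increment the count for `phrase` in every bucket that covers `day`."""
    for path in bucket_paths(day):
        node = stats
        for part in path.split("/"):
            node = node.setdefault(part, {})
        node[phrase] = node.get(phrase, 0) + 1


stats = {}
record_phrase(stats, "who are you", date(2019, 6, 7))
record_phrase(stats, "hello", date(2019, 6, 7))
record_phrase(stats, "who are you", date(2019, 6, 8))
```

Every new message now results in four increments instead of one, which is the extra write cost this model accepts.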

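Alternatively, store all aggregates in a single list, and use a prefix to indicate the aggregation level (and with it the format of the rest of the key):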

stats
   --topPhrases
     --keyA
        --phrase: "who are you"
        --count: 2
     --key
        --phrase: "hello"
        --count: 1
     --day_20190607
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --day_20190608
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --week_201922
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --week_201923
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --month_201905
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
     --month_201906
        --keyA
           --phrase: "who are you"
           --count: 2
        --key
           --phrase: "hello"
           --count: 1
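
A minimal sketch of how such prefixed keys might be built and taken apart again is shown below. The helper names `aggregate_keys` and `split_key` are hypothetical, and ISO week numbering is again an assumption; the model above only fixes the `day_`, `week_`, and `month_` prefixes.

```python
from datetime import date


def aggregate_keys(day):
    """Build the prefixed key for each aggregation level that covers `day`."""
    iso_year, iso_week, _ = day.isocalendar()
    return {
        "day": f"day_{day:%Y%m%d}",
        "week": f"week_{iso_year}{iso_week:02d}",  # ISO week numbering (assumed)
        "month": f"month_{day:%Y%m}",
    }


def split_key(key):
    """Split a prefixed key into (aggregation level, period)."""
    level, _, period = key.partition("_")
    return level, period


keys = aggregate_keys(date(2019, 6, 7))
```

Because the prefix tells you how to read the rest of the key, a single list can hold all three levels without ambiguity.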

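You are definitely duplicating a lot of data here, but the advantage of these models is that showing the stats to the user is now trivial. This is a common trade-off with NoSQL databases: writing data becomes more complex and more (duplicated) data is stored, but reading the data becomes trivial, and therefore highly scalable.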
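
To show how little work is left on the read side, here is a small sketch that takes one already-loaded bucket (for example the node for today) and returns its top phrases. The function name and the `limit` parameter are hypothetical; the bucket shape follows the `phrase`/`count` entries shown in the models above.

```python
def top_phrases(bucket, limit=10):
    """Return up to `limit` (phrase, count) pairs from one bucket, highest count first."""
    entries = sorted(bucket.values(), key=lambda entry: entry["count"], reverse=True)
    return [(entry["phrase"], entry["count"]) for entry in entries[:limit]]


# One bucket as stored under a day, week, or month key.
today = {
    "keyA": {"phrase": "who are you", "count": 2},
    "key": {"phrase": "hello", "count": 1},
}
ranking = top_phrases(today, limit=2)
```

The dashboard only needs to pick the right bucket key for "today", "this week", or "this month" and order its entries by count; no aggregation happens at read time.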

Realtime database data structure modeling
