Question

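I have a device table (the 'device' table) that holds static fields together with the current statistics, and another table (the 'devicestat' table) that holds the statistics collected for each device every minute, ordered by timestamp as shown below.

Example: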

CREATE TABLE device(
   "partitionId" text,
   "deviceId" text,
   "name" text,
   "totalMemoryInMB" bigint,
   "totalCpu" int,
   "currentUsedMemoryInMB" bigint,
   "totalStorageInMB" bigint,
   "currentUsedCpu" int,
   "ipAddress" text,
    primary key ("partitionId","deviceId"));


CREATE TABLE devicestat(
   "deviceId" text,
   "timestamp" timestamp,
   "totalMemoryInMB" bigint,
   "totalCpu" int,
   "usedMemoryInMB" bigint,
   "totalStorageInMB" bigint,
   "usedCpu" int,
    primary key ("deviceId","timestamp"));

Where,

currentUsedMemoryInMB & currentUsedCpu => Hold the most recent statistics

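usedMemoryInMB & usedCpu => Hold the most recent as well as older statistics, keyed by timestamp.

Can anyone suggest the right approach for the following concept?

Whenever I need the static data with the latest statistics, I read from the device table, and whenever I need the history of a device's statistics, I read from the devicestat table.

This works fine for me, but the only problem is that I need to write the statistics to both tables. For the devicestat table it is a new entry keyed by timestamp, but for the device table we only update the statistics. What are your thoughts on this: should the statistics be maintained only in the single stat table, or is it fine to also update the latest statistics in the device table?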

Answer 1

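In Cassandra, the common approach is to have one table (ColumnFamily) per query. Denormalization is also considered good practice in Cassandra. So keeping two column families in this case is fine.

Another way to get the latest statistics from the devicestat table is to keep the data sorted by timestamp in DESC order: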
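A minimal sketch of what the dual write could look like, assuming the two tables from the question. The device id, partition id, timestamp and values below are made up for illustration. A logged batch is one option if you want both writes to be applied together; two separate statements would also work if that guarantee is not needed.

```cql
-- Illustrative values only: 'dev-1', 'p-1' and the numbers are assumptions.
BEGIN BATCH
  -- Append a new history row for this minute.
  INSERT INTO devicestat ("deviceId", "timestamp", "totalMemoryInMB", "totalCpu",
                          "usedMemoryInMB", "totalStorageInMB", "usedCpu")
  VALUES ('dev-1', '2015-06-01 10:00:00+0000', 8192, 4, 2048, 102400, 35);

  -- Overwrite the "current" statistics on the device row.
  UPDATE device
  SET "currentUsedMemoryInMB" = 2048, "currentUsedCpu" = 35
  WHERE "partitionId" = 'p-1' AND "deviceId" = 'dev-1';
APPLY BATCH;
```

Since the two rows live in different partitions, the batch costs more than two plain writes, so it is a trade-off rather than a requirement.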

CREATE TABLE devicestat(
   "deviceId" text,
   "timestamp" timestamp,
   "totalMemoryInMB" bigint,
   "totalCpu" int,
   "usedMemoryInMB" bigint,
   "totalStorageInMB" bigint,
   "usedCpu" int,
    primary key ("deviceId","timestamp"))
WITH CLUSTERING ORDER BY ("timestamp" DESC);

So, when you know the deviceId, you can query with limit 1:

select * from devicestat where "deviceId" = 'someId' limit 1;
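The same table still serves the history reads mentioned in the question. A sketch, assuming an illustrative one-hour window (the time bounds are made up). Note that the column names were created with double quotes, so they are case-sensitive and need the quotes in queries as well.

```cql
-- History for one device within a time window; bounds are illustrative.
-- With the DESC clustering order, rows come back newest first.
SELECT "timestamp", "usedMemoryInMB", "usedCpu"
FROM devicestat
WHERE "deviceId" = 'someId'
  AND "timestamp" >= '2015-06-01 09:00:00+0000'
  AND "timestamp" <  '2015-06-01 10:00:00+0000';
```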

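But if you want to list the latest statistics of the devices by partitionId, then your approach of updating the device table with the latest statistics is correct.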
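For example, a listing by partition could look like the following sketch, where 'p-1' is an assumed partition id. Because "partitionId" is the partition key of the device table, this reads a single partition.

```cql
-- Latest statistics for every device in one partition ('p-1' is illustrative).
SELECT "deviceId", "name", "currentUsedMemoryInMB", "currentUsedCpu"
FROM device
WHERE "partitionId" = 'p-1';
```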

Normalization in Cassandra for the specified use case?
