I'm concerned about the value of "Compacted partition maximum bytes", because at roughly 89 MB it seems quite high.
Does this indicate a broken data model or some other problem? We have not noticed any issues on the application side.
The data stored in the table is packed into per-device weekly buckets via the (week_first_day, device_id) partition key.
Data model of the table:
CREATE TABLE device_data (
week_first_day timestamp,
device_id uuid,
nano_since_epoch bigint,
sensor_id uuid,
source text,
unit text,
username text,
value double,
PRIMARY KEY ((week_first_day, device_id), nano_since_epoch, sensor_id)
)
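For illustration, a read that only touches a slice of one weekly partition would look roughly like this (the week, device id and timestamp bounds are placeholder values, not taken from the real application):

-- Hypothetical slice read: both partition key columns are fixed and the
-- range on nano_since_epoch limits the query to part of the partition.
SELECT nano_since_epoch, sensor_id, value, unit
FROM device_data
WHERE week_first_day = '2019-07-01'
  AND device_id = 123e4567-e89b-12d3-a456-426655440000
  AND nano_since_epoch >= 1562025600000000000
  AND nano_since_epoch <  1562112000000000000;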
Output of nodetool cfstats:
Table: device_data
SSTable count: 5
Space used (live): 447558297
Space used (total): 447558297
Space used by snapshots (total): 0
Off heap memory used (total): 211264
SSTable Compression Ratio: 0.2610509614736755
Number of partitions (estimate): 939
Memtable cell count: 458
Memtable data size: 63785
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 0
Local read latency: NaN ms
Local write count: 458
Local write latency: 0.058 ms
Pending flushes: 0
Percent repaired: 99.83
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 2216
Bloom filter off heap memory used: 2176
Index summary off heap memory used: 672
Compression metadata off heap memory used: 208416
Compacted partition minimum bytes: 43
Compacted partition maximum bytes: 89970660
Compacted partition mean bytes: 1100241
Average live cells per slice (last five minutes): NaN
Maximum live cells per slice (last five minutes): 0
Average tombstones per slice (last five minutes): NaN
Maximum tombstones per slice (last five minutes): 0
Dropped Mutations: 0
Answer (score: 0):
It really depends on the access pattern for the data in that partition: if you regularly read the whole partition, it could cause problems, but if you only read slices of it, it shouldn't be an issue. You could also break the partition up further, for example by using the day instead of the week as the bucket (a sketch of that follows below).
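A minimal sketch of that daily-bucketing variant, assuming the rest of the model stays the same (the table and column names device_data_daily and day_start are made up for the example):

CREATE TABLE device_data_daily (
    day_start timestamp,          -- first instant of the day, replaces week_first_day
    device_id uuid,
    nano_since_epoch bigint,
    sensor_id uuid,
    source text,
    unit text,
    username text,
    value double,
    -- partitioning by (day_start, device_id) caps each partition at one day of
    -- data per device, roughly a seventh of the current worst-case weekly partition
    PRIMARY KEY ((day_start, device_id), nano_since_epoch, sensor_id)
)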
Also take a look at the talk "Myths of Big Partitions" from the Cassandra Summit two years ago; it goes into detail on how large partitions are handled in Cassandra 3.x.