I have created a Cassandra table like this, which holds a large amount of data:
CREATE TABLE keyspace.table1 (
uuid blob,
id bigint,
timestamp bigint,
description text,
option1 double,
PRIMARY KEY (uuid, id) ) WITH CLUSTERING ORDER BY (id ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
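I am trying to run nodetool cfstats on it to determine the number of rows. From what I found online, Number of keys (estimate) should be the row count. However, as shown below, that number is very low, so I know it is not right. What am I doing wrong?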
Table: table1
SSTable count: 3
Space used (live): 195.02 MB
Space used (total): 195.02 MB
Space used by snapshots (total): 567.99 KB
Off heap memory used (total): 61.83 KB
SSTable Compression Ratio: 0.3936987749701019
Number of keys (estimate): 19
Memtable cell count: 612048
Memtable data size: 14.18 MB
Memtable off heap memory used: 0 bytes
Memtable switch count: 6
Local read count: 2657130
Local read latency: 0.055 ms
Local write count: 2409743
Local write latency: 0.017 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.00000
Bloom filter space used: 64 bytes
Bloom filter off heap memory used: 40 bytes
Index summary off heap memory used: 84 bytes
Compression metadata off heap memory used: 61.71 KB
Compacted partition minimum bytes: 49.82 KB
Compacted partition maximum bytes: 85.8 MB
Compacted partition mean bytes: 27.06 MB
Average live cells per slice (last five minutes): 1.0160752060827343
Maximum live cells per slice (last five minutes): 5722
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
If this cannot be done, is there another way to get the row count of the table?
Thanks
Answer 0: (score: 1)
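From your schema, your partition key is your uuid column. To the Cassandra storage engine, each partition key is one "row". So cfstats is simply reporting the number of partition keys stored for that table (as an estimate, of course).
I would check how many distinct UUIDs there are in the system; if it is around 19, then everything is fine.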
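To make the distinction concrete, here is a small Python sketch that models the table as a list of (uuid, id) primary keys. The sample values are invented for illustration and nothing is read from Cassandra; it only shows why the partition count can be far smaller than the CQL row count.

```python
# Toy model of table1: each tuple is one CQL row's primary key (uuid, id).
# The values are invented for illustration only.
rows = [
    ("uuid-a", 1), ("uuid-a", 2), ("uuid-a", 3),
    ("uuid-b", 1), ("uuid-b", 2),
    ("uuid-c", 1),
]

# A partition is identified by the partition key alone (uuid).
partition_count = len({uuid for uuid, _id in rows})

# A CQL row is identified by the full primary key (uuid, id).
cql_row_count = len(set(rows))

print("partitions:", partition_count)
print("CQL rows:", cql_row_count)
```

In this model cfstats would correspond to the first number, while the question is asking for the second.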
Answer 1: (score: 1)
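It is not the number of "rows"; it is the number of keys, that is, partitions. In your data model that is the number of unique uuid values. Note that in 2.0 this number can be somewhat off, because it sums the partition counts of all the sstables. After 2.1.6 it merges a HyperLogLog structure instead, so duplicates across sstables do not affect it.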
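The difference between the two estimation approaches can be sketched in a few lines of Python. The per-sstable key sets are invented, and an exact set union is used as a stand-in for the HyperLogLog merge, which in reality is approximate.

```python
# Each set holds the partition keys present in one SSTable (invented data).
# The same partition can appear in several SSTables until compaction merges them.
sstables = [
    {"uuid-a", "uuid-b"},
    {"uuid-b", "uuid-c"},
    {"uuid-a", "uuid-c", "uuid-d"},
]

# 2.0-style estimate as described above: add up the per-SSTable counts.
summed_estimate = sum(len(keys) for keys in sstables)

# Merging first removes duplicates. An exact set union is used here as a
# stand-in for the approximate HyperLogLog merge.
merged_estimate = len(set().union(*sstables))

print("summed:", summed_estimate)
print("merged:", merged_estimate)
```

Summing counts every partition once per sstable it appears in, so it overestimates whenever a partition is spread over several sstables.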
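To get the number of actual CQL rows you have to read the data, either with a count query or with a Spark job. Both are expensive, so you may want to consider maintaining a separate table with a counter.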
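One caveat with a counter table is that Cassandra writes are upserts, so an application that increments the counter on every write will overcount when a primary key is written more than once. The sketch below is an in-memory Python model of that behavior, not driver code; the class and method names are hypothetical.

```python
class CountedTable:
    """In-memory stand-in for a data table plus a separate counter table."""

    def __init__(self):
        self.rows = {}        # (uuid, id) -> description
        self.row_counter = 0  # plays the role of the counter column

    def insert(self, uuid, id_, description):
        # Cassandra INSERTs are upserts: writing an existing primary key
        # overwrites the row instead of adding a new one.
        self.rows[(uuid, id_)] = description
        # The counter is bumped on every write, whether or not the key is new.
        self.row_counter += 1


table = CountedTable()
table.insert("uuid-a", 1, "first")
table.insert("uuid-a", 2, "second")
table.insert("uuid-a", 2, "second, rewritten")  # same primary key again

print("actual rows:", len(table.rows))
print("counter:", table.row_counter)
```

The counter is therefore only as accurate as the application's guarantee that it increments exactly once per new primary key.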