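Insert query replaces rows that have the same value in a Cassandra clustering column

Time: 2017-07-27 06:21:34

Tags: cassandra

I am learning Cassandra, starting with v3.8. My sample keyspace/table looks like this: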

CREATE TABLE digital.usage (
    provider decimal,
    deviceid text,
    date text,
    hours varint,
    app text,
    flat text,
    usage decimal,
    PRIMARY KEY ((provider, deviceid), date, hours)
) WITH CLUSTERING ORDER BY (date ASC, hours ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

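I use a composite PRIMARY KEY with provider and deviceid as the partition key, so that rows are unique and distributed across the cluster nodes. The clustering keys are then date and hours.

I have a few observations:

1) With PRIMARY KEY ((provider, deviceid), date, hours), when I insert multiple entries for the same hours value, only the latest record is kept and the earlier ones disappear.

2) With PRIMARY KEY ((provider, deviceid), date), when I insert multiple entries for the same date value, only the latest record is kept and the earlier ones disappear.

I am happy with the behavior in point 1, but I would like to know what is happening in the background. Do I need to understand more about clustering order keys?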

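1 answer:

Answer 0: (score: 2)

PRIMARY KEY means unique.

Most RDBMSs throw an error if you insert a duplicate value for the PRIMARY KEY.

Cassandra does not do a read before a write. It creates a new version of the record with the latest timestamp. When you insert data with the same values for all the primary key columns, the new data is written with the latest timestamp, and a query (SELECT) returns only the version with the latest timestamp.

Example: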

PRIMARY KEY((provider, deviceid), date, hours)
INSERT INTO digital.usage (provider, deviceid, date, hours, app, flat) VALUES (1.0, 'a', '2017-07-27', 1, 'test', 'test');
-- This creates a new record with, let's say, timestamp 1
INSERT INTO digital.usage (provider, deviceid, date, hours, app, flat) VALUES (1.0, 'a', '2017-07-27', 1, 'test1', 'test1');
-- This creates a new version of the record with, let's say, timestamp 2

SELECT app, flat FROM digital.usage WHERE provider = 1.0 AND deviceid = 'a' AND date = '2017-07-27' AND hours = 1;

This returns:
 ------------
| app | flat | 
|-----|------|
|test1|test1 |
 ------------
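
To see the timestamp behavior described above for yourself, here is a minimal sketch that you could run in cqlsh against the digital.usage table from the question. The values 'test2' and hours = 2 are made up for illustration. It uses the built-in WRITETIME function to show the write timestamp of the surviving value, and then inserts a row with a different clustering value to show that only an identical primary key is overwritten.

```cql
-- Two writes to the same primary key (1.0, 'a', '2017-07-27', 1):
-- the second one replaces the values written by the first.
INSERT INTO digital.usage (provider, deviceid, date, hours, app, flat)
VALUES (1.0, 'a', '2017-07-27', 1, 'test', 'test');
INSERT INTO digital.usage (provider, deviceid, date, hours, app, flat)
VALUES (1.0, 'a', '2017-07-27', 1, 'test1', 'test1');

-- WRITETIME gives the write timestamp (microseconds since the epoch)
-- of a non-primary-key column, here the value that won.
SELECT app, flat, WRITETIME(app)
FROM digital.usage
WHERE provider = 1.0 AND deviceid = 'a' AND date = '2017-07-27' AND hours = 1;

-- A different clustering value (hours = 2) is a different primary key,
-- so this adds a second row to the partition instead of overwriting.
INSERT INTO digital.usage (provider, deviceid, date, hours, app, flat)
VALUES (1.0, 'a', '2017-07-27', 2, 'test2', 'test2');

-- Lists every row in the partition for that date, ordered by hours.
SELECT hours, app, flat
FROM digital.usage
WHERE provider = 1.0 AND deviceid = 'a' AND date = '2017-07-27';
```

On an otherwise empty table, the last query should list one row for hours = 1 (holding 'test1') and one for hours = 2. This also covers observation 2 in the question: with PRIMARY KEY ((provider, deviceid), date), hours is no longer part of the key, so both inserts would target the same row and the later one would win.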