
How does Cassandra handle duplicate data when reading from SSTables?

Time: 2018-12-05 07:37:59

Tags: cassandra datastax

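The DataStax documentation says:

During a write, Cassandra adds each new row to the database without checking on whether a duplicate record exists. This policy makes it possible that many versions of the same row may exist in the database.

As I understand it, this means there can be more than one uncompacted SSTable containing different versions of the same row. How does Cassandra handle the duplicate data when it reads from these SSTables?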

1 answer:

Answer 0 (score: 1)

@quangh: As the documentation states:

This is why Cassandra performs another round of comparisons during a read process. When a client requests data with a particular primary key, Cassandra retrieves many versions of the row from one or more replicas. The version with the most recent timestamp is the only one returned to the client ("last-write-wins").

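Every write has an associated timestamp. Because writes are never checked against existing data, different nodes (and different SSTables) can hold different versions of the same row. During a read, Cassandra compares those versions and returns the one with the most recent timestamp. I hope this answers your question.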
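To make the "last-write-wins" rule concrete, here is a minimal Python sketch of the idea. It is not Cassandra's actual read path: `RowVersion`, `read_row`, and the list-of-lists representation of SSTables are hypothetical names invented for this illustration. It also treats a row as a single value, whereas real Cassandra reconciles at a finer granularity (per column) and has to account for deletion markers (tombstones).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RowVersion:
    """One stored version of a row (hypothetical model, not a Cassandra type)."""
    key: str
    value: str
    write_timestamp: int  # assigned when the write happened


def read_row(key: str, sstables: List[List[RowVersion]]) -> Optional[RowVersion]:
    """Return the newest version of `key` found across all SSTables, or None."""
    # Collect every version of the row, regardless of which SSTable holds it.
    candidates = [
        version
        for sstable in sstables
        for version in sstable
        if version.key == key
    ]
    if not candidates:
        return None
    # Last-write-wins: the version with the highest timestamp is returned.
    return max(candidates, key=lambda v: v.write_timestamp)


# Two SSTables that both contain "user:1", written at different times.
sstables = [
    [RowVersion("user:1", "alice@old.example", 1000)],
    [
        RowVersion("user:1", "alice@new.example", 2000),
        RowVersion("user:2", "bob@example.com", 1500),
    ],
]

latest = read_row("user:1", sstables)
print(latest.value)
```

Note that the write side never needs to look at older versions: each write simply appends a new `RowVersion`, and the comparison is deferred until read time (or until compaction merges the SSTables).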