Cassandra: new row created by UPDATE IF EXISTS

Time: 2017-03-03 15:36:03

Tags: cassandra

I have a table. For simplicity, let's say this is its definition:

CREATE TABLE t (pk1 varchar, pk2 varchar, c1 varchar, c2 varchar, PRIMARY KEY(pk1, pk2));

I run several operations in parallel, each using the full PK:

INSERT INTO t (pk1, pk2, c1, c2) values (?, ?, ?, ?) IF NOT EXISTS;

DELETE FROM t where pk1 = ? AND pk2 = ?;

UPDATE t set c1 = ? where pk1 = ? AND pk2 = ? IF EXISTS;

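Notes:

  1. c2 is never null in the INSERT command.
  2. c2 is not set in the UPDATE command.

With these commands I should never end up with a row where c2 = null. The problem is that I occasionally see such rows. I cannot reproduce it easily, but it always happens when I stress the system (multiple parallel clients running inserts, updates, and deletes against the same PK).

Edit: My cluster has 4 nodes with RF = 2 (NetworkTopologyStrategy with 1 DC), and I use CL = QUORUM for all queries.

Am I missing something, or is there a bug in LWT?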

2 Answers:

Answer 0: (score: 0)

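In general, one possibility is a DELETE that executes in parallel with the UPDATE without providing any guarantees.

Example (kept simple on purpose; the same thing can happen with other settings):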

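  1. Assume a cluster of 3 nodes with RF = 3, where node1 has a temporary connectivity problem with the other two nodes, node2 and node3.

  2. A DELETE with CL = ONE is executed against node1 with timestamp T1 (it is not applied on node2 or node3).

  3. The UPDATE is executed on node2 and node3 with timestamp T2 (T2 > T1).

  4. Connectivity between node1, node2, and node3 is restored. The tombstone introduced by the DELETE now removes all older data (including c2), while the UPDATE has only set pk1, pk2, and c1, leaving c2 as null.

  5. If you apply the DELETE using LWT, this should not happen, as long as no TTL is used.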

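    A TTL can be set directly in the INSERT statement or by default through a table property. To check whether this applies:

    1. DESCRIBE TABLE returns the default_time_to_live set for this table.

    2. If a TTL is set, select ttl(c2) ... returns a value.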
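The delete/update race in steps 1 to 4 above can be sketched with a small toy model. This is only an illustration, not Cassandra code: the `Replica` class, `newest`, and `simulate` are hypothetical names, only two replicas are modelled, and the model assumes timestamp-based last-write-wins reconciliation in which an INSERT writes a row marker while an UPDATE writes only the cells it sets.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


def newest(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Return the larger of two optional timestamps."""
    if a is None:
        return b
    if b is None:
        return a
    return max(a, b)


@dataclass
class Replica:
    """Toy view of a single row on one replica (hypothetical model)."""
    cells: Dict[str, Tuple[int, str]] = field(default_factory=dict)  # column -> (ts, value)
    row_marker_ts: Optional[int] = None  # written by INSERT, not by UPDATE
    tombstone_ts: Optional[int] = None   # timestamp of the row deletion

    def _write(self, ts, cols):
        # Keep a cell only if it is newer than what is already stored.
        for name, value in cols.items():
            if name not in self.cells or self.cells[name][0] < ts:
                self.cells[name] = (ts, value)

    def insert(self, ts, **cols):
        self.row_marker_ts = newest(self.row_marker_ts, ts)
        self._write(ts, cols)

    def update(self, ts, **cols):
        self._write(ts, cols)  # no row marker

    def delete(self, ts):
        self.tombstone_ts = newest(self.tombstone_ts, ts)

    def merge(self, other):
        """Simulate the replicas converging: newest timestamp wins per item."""
        merged = Replica(dict(self.cells), self.row_marker_ts, self.tombstone_ts)
        for name, (ts, value) in other.cells.items():
            merged._write(ts, {name: value})
        merged.row_marker_ts = newest(merged.row_marker_ts, other.row_marker_ts)
        merged.tombstone_ts = newest(merged.tombstone_ts, other.tombstone_ts)
        return merged

    def read(self, columns=("c1", "c2")):
        def live(ts):
            # Data survives only if it is strictly newer than the tombstone.
            return ts is not None and (self.tombstone_ts is None or ts > self.tombstone_ts)

        values = {
            c: self.cells[c][1] if c in self.cells and live(self.cells[c][0]) else None
            for c in columns
        }
        if live(self.row_marker_ts) or any(v is not None for v in values.values()):
            return values
        return None  # row does not exist


def simulate(t0=1, t1=2, t2=3):
    node1, node2 = Replica(), Replica()
    for node in (node1, node2):
        node.insert(t0, c1="a", c2="b")  # row exists everywhere at T0
    node1.delete(t1)                     # DELETE reaches only node1 at T1
    node2.update(t2, c1="new")           # UPDATE reaches only node2 at T2
    return node1.merge(node2).read()


print(simulate())  # {'c1': 'new', 'c2': None}
```

In this model the tombstone at T1 shadows the c2 cell written at T0, but not the c1 cell written at T2, so the merged row reappears with c2 = null. If the DELETE carried the newest timestamp instead, the row would be gone entirely.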

Answer 1: (score: 0)

https://docs.datastax.com/en/cassandra/3.0/cassandra/dml/dmlLtwtTransactions.html

If lightweight transactions are used to write to a row within a partition, only lightweight transactions for both read and write operations should be used. This caution applies to all operations, whether individual or batched. For example, the following series of operations can fail:

DELETE ...
INSERT .... IF NOT EXISTS
SELECT ....

The following series of operations will work:

DELETE ... IF EXISTS
INSERT .... IF NOT EXISTS
SELECT .....

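Note: the same applies to combinations of INSERT and UPDATE, and to the bug encountered recently. If you use transactions, use them for all related statements. The recent issue may be related to timestamps differing slightly, which is explained better here: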

https://jira.apache.org/jira/browse/CASSANDRA-14304

doanduyhai DOAN DuyHai added a comment - 24/Mar/18 15:12
Hints:

1) LWT operations are using the ballot based on an agreement of timestamp value between QUORUM of replicas. It can happens that the timestamp is slightly incremented (some microsecs) in case of conflict/contention on the cluster. The consequence is that the timestamp used for LWT can be slightly (again in microsecs) in the future. Read the source code to check the part of the code responsible for ballot agreement with Paxos

2) For the DELETE request:

   a) it can use the <current> timestamp, which can belong to the "past" with respect to the one used by LWT, thus SELECT does return a value