I have two tables:
CREATE TABLE IF NOT EXISTS QueueBucket (
queueName text,
bucketId int,
scheduledMinute timestamp,
scheduledTime timestamp,
messageId uuid,
PRIMARY KEY ((queueName, bucketId, scheduledMinute), scheduledTime, messageId)
) WITH compaction = { 'class' : 'LeveledCompactionStrategy' } AND speculative_retry='NONE' ;
CREATE TABLE IF NOT EXISTS InDelivery (
queueName text,
nodeId uuid,
dequeuedMinute timestamp,
messageId uuid,
bucketid int,
dequeuedTime timestamp,
PRIMARY KEY ((queueName, nodeId,bucketId, dequeuedMinute),dequeuedTime, messageId)
);
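In code, I insert into QueueBucket and delete from InDelivery in a (logged) batch. During load testing, the insert into QueueBucket works, but the delete from InDelivery sometimes does not. To confirm this, I added a read from InDelivery right after the delete; if the deleted messageId is still present, it prints a WARN log.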
queueDao.insertMsgInfo(queueName, bucketId, QueueUtils.getMinute(scheduledTime), scheduledTime, messageId);
queueDao.deleteInDelivery(queueName, nodeId, bucketId, bucketMinute, dequeuedTime, messageId);
if(queueServiceMetaDao.hasIndeliveryMessage(inDeliveryPK)) {
log.warn("messageId {} of queue {} bucket {} with node {} dequuedTime {} dequeud minute {} could not get deleted from indelivery.",
messageId,queueName,bucketId, nodeId,QueueUtils.dateToString(dequeuedTime),QueueUtils.dateToString(bucketMinute));
}
In the insertMsgInfo and deleteInDelivery methods I am reusing prepared statements:
"INSERT INTO queuebucket (queuename, bucketid , scheduledminute, scheduledtime, messageid ) VALUES ( ? , ? , ? , ? , ? );"
"DELETE FROM indelivery WHERE queuename = ? AND nodeId = ? AND bucketId=? AND dequeuedMinute=? AND dequeuedTime =? AND messageId=? ;"
In hasIndeliveryMessage I pass the same values, wrapped in inDeliveryPrimaryKey, that I use to delete the indelivery data in the moveBackToQueueBucket method:
"SELECT messageId FROM indelivery WHERE queuename = ? AND nodeId = ? AND bucketId=? AND dequeuedMinute=? AND dequeuedTime=? AND messageId=? ;"
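I don't know why I see multiple warning messages saying "could not get deleted from indelivery." Please help.
I am using Cassandra 2.2.7 on a 6-node cluster with a replication factor of 5, and both reads and writes use QUORUM consistency.
I also went through the link "Cassandra - deleted data still there" and https://issues.apache.org/jira/browse/CASSANDRA-7810, but that issue was fixed long ago, in 2.0.11.
Further update: following "Cassandra - Delete not working", I also ran nodetool repair, but the problem persists. Should I run compaction as well?
Further update: I no longer use a batch. I do a simple insert into queuebucket and a delete from indelivery, then read the data back, but the problem still persists.
Adding some logs: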
2016-07-19 20:39:42,440[http-nio-8014-exec-12]INFO QueueDaoImpl -deleting from indelivery queueName pac01_deferred nodeid 1349d57f-28f5-37d4-9fe1-dfa14dba4a9f bucketId 382 dequeuedMinute 20160719203900000 dequeuedTime 20160719203942310 messageId cc4fb158-f61e-345b-8dcf-3f842fe52d50:
2016-07-19 20:39:42,442[http-nio-8014-exec-12]INFO QueueDaoImpl -Reading from indelivery : queue pac01_deferred nodeId 1349d57f-28f5-37d4-9fe1-dfa14dba4a9f dequeueMinute 20160719203900000 dequeueTime 20160719203942310 messageid cc4fb158-f61e-345b-8dcf-3f842fe52d50 bucketId 382 indeliveryRow Row[cc4fb158-f61e-345b-8dcf-3f842fe52d50]
2016-07-19 20:39:42,442[http-nio-8014-exec-12]WARN QueueImpl -messageId cc4fb158-f61e-345b-8dcf-3f842fe52d50 of queue pac01_deferred bucket 382 with node 1349d57f-28f5-37d4-9fe1-dfa14dba4a9f dequuedTime 20160719203942310 dequeud minute 20160719203900000 could not get deleted from indelivery .
Should I try consistency ALL?
Answer 0 (score: 1)
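First of all, using Cassandra to back a queue or a queue-like structure is a known anti-pattern. If your queue handles high throughput, you will run into problems with tombstones and degraded query performance.
As for your actual problem, I have seen this happen before with models that use timestamps as keys. How are you creating the timestamp values for dequeuedMinute and dequeuedTime?
If you are putting the timestamps together yourself, then deleting by them should be easy. However, if you create them with dateOf(now()) or java.util.Date, your timestamps will be stored with milliseconds, even though cqlsh hides that from you: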
INSERT INTO InDelivery (queuename, nodeid, bucketid , dequeuedMinute, dequeuedTime, messageid )
VALUES ('test1',uuid(),2112,dateof(now()),dateof(now()),uuid());
INSERT INTO InDelivery (queuename, nodeid, bucketid , dequeuedMinute, dequeuedTime, messageid )
VALUES ('test1',a24e056a-94fa-4aee-b3a7-a8df6060091a,2112,'2016-07-19 09:57:16-0500','2016-07-19 09:57:16-0500',uuid());
SELECT queuename,nodeid,dequeuedMinute,blobasbigint(timestampasblob(dequeuedMinute)),
dequeuedTime,blobasbigint(timestampasblob(dequeuedTime)),messageid
FROM InDelivery;
queuename | nodeid | dequeuedMinute | blobasbigint(timestampasblob(dequeuedMinute)) | dequeuedTime | blobasbigint(timestampasblob(dequeuedTime)) | messageid
-----------|--------------------------------------+-------------------------------+-----------------------------------------------+--------------------------+--------------------------------------+---------------------------------------------
test1 | a24e056a-94fa-4aee-b3a7-a8df6060091a | 2112 2016-07-19 09:57:16-0500 | 1468940236000 | 2016-07-19 09:57:16-0500 | 1468940236000 | 7ca1f676-9034-45ba-bb3f-377ba74cc5c0
test1 | a24e056a-94fa-4aee-b3a7-a8df6060091a | 2112 2016-07-19 09:57:16-0500 | 1468940236641 | 2016-07-19 09:57:16-0500 | 1468940236641 | 9721d96e-d6f5-43a7-9ba4-18ef4d54ab8a
(2 rows)
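Those timestamps look identical, right? But applying the nested blobasbigint(timestampasblob( functions reveals the difference (000 vs. 641 milliseconds).
Note that if I change the SELECT to filter on the 641 milliseconds (the last 3 digits in the blobasbigint(timestampasblob( column), I get back the row that has the milliseconds.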
SELECT queuename,nodeid,dequeuedMinute,blobasbigint(timestampasblob(dequeuedMinute)),
dequeuedTime,blobasbigint(timestampasblob(dequeuedTime)),messageid
FROM InDelivery
WHERE queuename='test1' AND bucketid=2112
AND nodeid=a24e056a-94fa-4aee-b3a7-a8df6060091a
AND dequeuedMinute='2016-07-19 09:57:16.641-0500';
queuename | nodeid | dequeuedMinute | blobasbigint(timestampasblob(dequeuedMinute)) | dequeuedTime | blobasbigint(timestampasblob(dequeuedTime)) | messageid
-----------|--------------------------------------+-------------------------------+-----------------------------------------------+--------------------------+--------------------------------------+---------------------------------------------
test1 | a24e056a-94fa-4aee-b3a7-a8df6060091a | 2112 2016-07-19 09:57:16-0500 | 1468940236641 | 2016-07-19 09:57:16-0500 | 1468940236641 | 9721d96e-d6f5-43a7-9ba4-18ef4d54ab8a
(1 rows)
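The bottom line is that if you store milliseconds in your timestamp keys, you also need to include them when you SELECT/DELETE by those keys. Likewise, if you do not store milliseconds in your timestamp keys, you cannot include them when you SELECT/DELETE by those keys.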
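To make the millisecond point concrete on the Java side, here is a minimal sketch using only the JDK. The class TimestampKeys and its two helpers are hypothetical names for illustration; they are not the asker's QueueUtils.getMinute, whose implementation is not shown. The sample value is the millisecond timestamp from the cqlsh output above.

```java
import java.time.temporal.ChronoUnit;
import java.util.Date;

// Hypothetical helper class, not part of the original code base.
final class TimestampKeys {
    private TimestampKeys() {}

    /** Drops seconds and milliseconds, e.g. for a per-minute key column. */
    static Date truncateToMinute(Date d) {
        return Date.from(d.toInstant().truncatedTo(ChronoUnit.MINUTES));
    }

    /** Drops only the milliseconds. */
    static Date truncateToSecond(Date d) {
        return Date.from(d.toInstant().truncatedTo(ChronoUnit.SECONDS));
    }

    public static void main(String[] args) {
        Date withMillis = new Date(1468940236641L);   // ...16.641, as stored by dateOf(now())
        Date secondOnly = truncateToSecond(withMillis);

        System.out.println(withMillis.getTime());                   // 1468940236641
        System.out.println(secondOnly.getTime());                   // 1468940236000
        // Both print as "09:57:16" at second resolution, yet they are different keys:
        System.out.println(withMillis.equals(secondOnly));          // false
        System.out.println(truncateToMinute(withMillis).getTime()); // 1468940220000
    }
}
```

Whichever resolution is chosen, the same Date value that was bound on INSERT has to be bound again on SELECT and DELETE; re-deriving it from a formatted string that dropped the milliseconds would produce a different key.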
Answer 1 (score: 1)
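Using TIMESTAMP on the client side solved my problem. Thanks to Mikhail Baksheev for pointing it out.
It is recommended to set it from the client in queries to maintain the ordering of mutations.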
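One way to produce such client-side timestamps is sketched below, using only the JDK. MonotonicMicros is a hypothetical class name, and the INSERT INTO indelivery statement is an assumption built from the table schema, since the original insert into that table is not shown. The generator only guarantees ordering within one JVM; ordering across several application nodes would still depend on their clocks.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: hands out strictly increasing microsecond timestamps
// within a single JVM, even when two calls land in the same millisecond.
final class MonotonicMicros {
    private final AtomicLong last = new AtomicLong();

    long next() {
        return last.updateAndGet(prev ->
                Math.max(prev + 1, System.currentTimeMillis() * 1000L));
    }

    // CQL with an explicit write timestamp (bound as the marker after USING TIMESTAMP).
    // The INSERT is an assumed statement derived from the InDelivery schema.
    static final String INSERT_IN_DELIVERY =
            "INSERT INTO indelivery (queuename, nodeid, bucketid, dequeuedminute, dequeuedtime, messageid) "
          + "VALUES (?, ?, ?, ?, ?, ?) USING TIMESTAMP ?;";
    static final String DELETE_IN_DELIVERY =
            "DELETE FROM indelivery USING TIMESTAMP ? "
          + "WHERE queuename = ? AND nodeid = ? AND bucketid = ? "
          + "AND dequeuedminute = ? AND dequeuedtime = ? AND messageid = ?;";

    public static void main(String[] args) {
        MonotonicMicros clock = new MonotonicMicros();
        long insertTs = clock.next();   // bind to INSERT_IN_DELIVERY
        long deleteTs = clock.next();   // bind to DELETE_IN_DELIVERY
        System.out.println(deleteTs > insertTs);   // true
    }
}
```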
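If we insert and then delete data, we must make sure that the TIMESTAMP value passed in the delete query is greater than the one passed at insert time.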
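The reason is last-write-wins reconciliation: the mutation with the higher timestamp decides what a read returns, regardless of arrival order. The following is a deliberately simplified model of that rule in plain Java, not Cassandra's actual implementation. LastWriteWinsCell is a hypothetical name, and the tie rule (a delete wins over a write with the same timestamp) is an assumption of this model.

```java
// Simplified, hypothetical model of last-write-wins for a single row.
final class LastWriteWinsCell {
    private long writeTime = Long.MIN_VALUE;
    private boolean live = false;

    void insert(long timestamp) { apply(timestamp, true); }

    void delete(long timestamp) { apply(timestamp, false); }

    private void apply(long timestamp, boolean makesLive) {
        // A mutation takes effect only if it is newer than what is stored.
        // Assumed tie rule for this model: on equal timestamps the delete wins.
        if (timestamp > writeTime || (timestamp == writeTime && !makesLive)) {
            writeTime = timestamp;
            live = makesLive;
        }
    }

    boolean isLive() { return live; }

    public static void main(String[] args) {
        LastWriteWinsCell row = new LastWriteWinsCell();
        row.insert(200L);
        row.delete(100L);                  // delete stamped earlier than the insert
        System.out.println(row.isLive());  // true: the row still shows up on read
        row.delete(201L);                  // delete stamped later than the insert
        System.out.println(row.isLive());  // false
    }
}
```

In this model a delete stamped earlier than the insert is silently shadowed, which matches the symptom in the question: the DELETE returns without error, yet the following SELECT still finds the row.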
Other reasons why a data delete in Cassandra fails, or appears to fail, could be