Cassandra table design for user chat

Time: 2018-07-25 05:18:09

Tags: database database-design cassandra nosql

I want to create a table in Cassandra for user chats. I ended up with this:

CREATE TABLE sample.user_messages (
    user_id INT,
    second_user_id INT,
    id TIMEUUID,
    author_id INT,
    message TEXT,
    PRIMARY KEY ((user_id), second_user_id, id)
) WITH CLUSTERING ORDER BY (second_user_id ASC, id DESC);

I have two types of queries:

  1. Get the chats between two users, which this table design satisfies: ... where user_id=100 and second_user_id=200

  2. Get all chats of a specific user, which this table design does not handle well, and I have no idea what to do. Should I use two queries, (1) ... where user_id=100 and (2) ... where second_user_id=100? The second query is not good. Is there any way to do this with only one query?

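3 answers:

Answer 0 (score: 3)

Your table already lets you fetch all chats by user_id, so you only need to insert each message into this table twice, swapping the user IDs for the second insert.

Write the message for the first user: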

UPDATE user_messages SET .... WHERE user_id = 100 AND second_user_id = 200 AND id = ....;

And write the same message for the second user:

UPDATE user_messages SET .... WHERE user_id = 200 AND second_user_id = 100 AND id = ....;

Now you can get all chats for each user:

Select * from user_messages where user_id = 100;
Select * from user_messages where user_id = 200;

Chats between two users:

Select * from user_messages where user_id = 100 and second_user_id = 200;

And vice versa:

Select * from user_messages where user_id = 200 and second_user_id = 100;

This approach duplicates data, but in Cassandra that is a common price to pay for read speed.
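
As a rough sketch of the double write described above, both copies can be written with INSERT statements in one logged batch. This assumes the sample.user_messages table from the question; the user IDs, the timeuuid literal, and the message text are made-up example values. A literal (or client-generated) timeuuid is used instead of now() so that both rows carry the same id.

```cql
-- Sketch only: one message stored in both users' partitions.
-- The ids, the timeuuid, and the text are illustrative values.
BEGIN BATCH
  -- Copy stored in user 100's partition
  INSERT INTO sample.user_messages (user_id, second_user_id, id, author_id, message)
  VALUES (100, 200, 50554d6e-29bb-11e5-b345-feff819cdc9f, 100, 'hello');
  -- Same message stored in user 200's partition, with the user IDs swapped
  INSERT INTO sample.user_messages (user_id, second_user_id, id, author_id, message)
  VALUES (200, 100, 50554d6e-29bb-11e5-b345-feff819cdc9f, 100, 'hello');
APPLY BATCH;
```

A logged batch that spans two partitions costs more than two independent writes, so whether to batch here is a judgment call; the batch is shown only to keep the two copies from drifting apart if one write fails.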

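[Edit] The large partition problem

If you expect each user to have too many messages, you should choose a different partition key instead of user_id alone. For example, you can use a composite partition key made of user_id and day; in that case each partition holds only one day of messages, with a separate partition for each day. This technique is usually called "bucketing" (the answer linked to some examples of bucketing).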
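
A minimal sketch of such a bucketed table might look like the following. The table name user_messages_by_day and the day column are assumptions made for illustration; the remaining columns are taken from the question's table.

```cql
-- Hypothetical bucketed variant: the partition key is (user_id, day),
-- so each partition holds one user's messages for a single day.
CREATE TABLE sample.user_messages_by_day (
    user_id INT,
    day DATE,
    second_user_id INT,
    id TIMEUUID,
    author_id INT,
    message TEXT,
    PRIMARY KEY ((user_id, day), second_user_id, id)
) WITH CLUSTERING ORDER BY (second_user_id ASC, id DESC);

-- Reading now requires the full partition key, i.e. the user and the day.
SELECT * FROM sample.user_messages_by_day
WHERE user_id = 100 AND day = '2018-07-25';
```

The trade-off is that fetching a user's full history takes one query per day bucket, so the bucket size (day, week, month) should match how much history is usually read at once.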

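Answer 1 (score: 1)

You can create two records for the two users, with the IDs reversed:

Record 1: user_id = 1 and second_user_id = 2

Record 2: user_id = 2 and second_user_id = 1

Obviously, both records must have the same id, author_id, and message.

Your second query then works: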

SELECT * FROM sample.user_messages WHERE user_id = 1

And your first query works in all cases, regardless of the order of the IDs you provide in the query:

SELECT * FROM sample.user_messages WHERE user_id = 1 AND second_user_id = 2
SELECT * FROM sample.user_messages WHERE user_id = 2 AND second_user_id = 1

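Both queries return the same result.

Answer 2 (score: 0)

I suggest using a secondary index on second_user_id, as follows:

CREATE INDEX index_second_user_id ON sample.user_messages (second_user_id);

Now your first query stays the same.

Your second query is split into two separate queries, one on user_id and one on second_user_id, as shown below: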

1) select * from "user_messages" where user_id=100;
2) select * from "user_messages" where second_user_id=100;

This should help.