Cassandra schema for a chat application

Time: 2014-06-12 05:14:30

Tags: cassandra cql

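I have gone through this article, and the schema below is what I got from it. It is helpful for maintaining the status of a user in my application, but how can I extend it to keep a one-to-one chat archive and the relations between users? By relations I mean that people belong to specific groups that I define. I am new to this and need an approach.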

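Requirements:

  • I want to store the messages exchanged between users in a table.
  • Whenever a user wants to load the messages sent by another user, I want to retrieve them and send them back to the user.
  • I want to retrieve all the messages from different users when the user requests them.
  • I also want to store classes of users. For example, user1 and user2 belong to "family", while user3, user4 and user1 belong to "friends", and so on. The group name can be a custom name given by the user.

This is what I have tried so far: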

CREATE TABLE chatarchive (
    chat_id uuid PRIMARY KEY,
    username text,
    body text
);

CREATE TABLE chatseries (
    username text,
    time timeuuid,
    chat_id uuid,
    PRIMARY KEY (username, time)
) WITH CLUSTERING ORDER BY (time ASC);

CREATE TABLE chattimeline (
    "to" text,  -- quoted because TO is a reserved word in CQL
    username text,
    time timeuuid,
    chat_id uuid,
    PRIMARY KEY (username, time)
) WITH CLUSTERING ORDER BY (time ASC);

Here is the schema I currently have:

CREATE TABLE users (
    username text PRIMARY KEY,
    password text
);

CREATE TABLE friends (
    username text,
    friend text,
    since timestamp,
    PRIMARY KEY (username, friend)
);

CREATE TABLE followers (
    username text,
    follower text,
    since timestamp,
    PRIMARY KEY (username, follower)
);

CREATE TABLE tweets (
    tweet_id uuid PRIMARY KEY,
    username text,
    body text
);

CREATE TABLE userline (
    username text,
    time timeuuid,
    tweet_id uuid,
    PRIMARY KEY (username, time)
) WITH CLUSTERING ORDER BY (time DESC);

CREATE TABLE timeline (
    username text,
    time timeuuid,
    tweet_id uuid,
    PRIMARY KEY (username, time)
) WITH CLUSTERING ORDER BY (time DESC);

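2 answers:

Answer 0 (score: 11)

With C*, you need to store data the way you will query it. So let's see how that looks for this case:

  • I want to store the messages exchanged between users in a table.
  • Whenever a user wants to load the messages sent by another user, I want to retrieve them and send them back to the user.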

CREATE TABLE chat_messages (
    message_id uuid,
    from_user text,
    to_user text,
    body text,
    class text,
    time timeuuid,
    PRIMARY KEY ((from_user, to_user), time)
) WITH CLUSTERING ORDER BY (time ASC);

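This lets you retrieve the timeline of messages between two users. Note that a composite partition key, (from_user, to_user), is used so that one wide row is created for each pair of users.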

SELECT * FROM chat_messages WHERE from_user = 'mike' AND to_user = 'john' ORDER BY time DESC ;
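
One thing to keep in mind with the (from_user, to_user) partition key is that it is directional: messages from mike to john and messages from john to mike end up in two different partitions. A minimal Python sketch, assuming you would rather keep both directions of a conversation together, is to put the pair in a fixed order before using it as the key. The helper name conversation_key is hypothetical and not part of the schema above.

```python
from typing import Tuple


def conversation_key(user_a: str, user_b: str) -> Tuple[str, str]:
    """Return the two usernames in a fixed (sorted) order.

    Hypothetical helper: using this pair as the partition key means both
    directions of a one-to-one chat map to the same partition.
    """
    first, second = sorted((user_a, user_b))
    return first, second


# Both directions of the conversation produce the same key.
print(conversation_key("mike", "john"))  # ('john', 'mike')
print(conversation_key("john", "mike"))  # ('john', 'mike')
```

With a key like this, the actual sender would be stored as a regular column so each message still records who wrote it.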

  • I want to retrieve all the messages from different users when the user requests them.

CREATE INDEX chat_messages_to_user ON chat_messages (to_user);

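This allows you to do:

SELECT * FROM chat_messages WHERE to_user = 'john';

  • I also want to store classes of users. For example, user1 and user2 belong to "family", while user3, user4 and user1 belong to "friends", and so on. The group name can be a custom name given by the user.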

CREATE INDEX chat_messages_class ON chat_messages (class);

So that you can do:

SELECT * FROM chat_messages WHERE class = 'family';

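Note that in this kind of database, denormalizing data is good practice. This means that repeating the name of the class on every row is not a bad thing.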
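
Since the group names are user-defined, one way to picture the denormalized data is as one entry per owner, group and member, with the group name repeated on every entry. The following in-memory Python sketch only illustrates that shape. The names add_to_group and members_of are made up for the example, and the code does not talk to Cassandra.

```python
from collections import defaultdict

# Maps (owner, group_name) -> set of member usernames.
# This mirrors a table partitioned by owner and clustered by group name.
groups = defaultdict(set)


def add_to_group(owner, group_name, member):
    """Record that `member` belongs to `owner`'s custom group."""
    groups[(owner, group_name)].add(member)


def members_of(owner, group_name):
    """Return the members of one of `owner`'s groups, sorted by name."""
    return sorted(groups[(owner, group_name)])


add_to_group("user1", "family", "user2")
add_to_group("user1", "friends", "user3")
add_to_group("user1", "friends", "user4")
print(members_of("user1", "friends"))  # ['user3', 'user4']
```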

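Also note that I did not use a 'chat_id' or a 'chats' table. We could easily add them, but I feel your use case, as presented, does not require them. In general, you cannot do joins in C*, so using a chat id would imply two queries.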

EDIT: Secondary indexes are inefficient. With C* 3.0, materialized views would be a better implementation.
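
On versions without materialized views, a common alternative to a secondary index is to write each message to a second table keyed by the recipient. The sketch below models that idea with two in-memory dictionaries in Python. The names by_conversation, by_recipient and send, as well as the user anna, are illustrative assumptions; real code would issue two INSERT statements instead.

```python
from collections import defaultdict

# Two "query tables", each keyed the way it will be read.
by_conversation = defaultdict(list)  # (from_user, to_user) -> messages
by_recipient = defaultdict(list)     # to_user -> messages


def send(from_user, to_user, body):
    """Store one message in both tables (denormalized write)."""
    message = {"from_user": from_user, "to_user": to_user, "body": body}
    by_conversation[(from_user, to_user)].append(message)
    by_recipient[to_user].append(message)


send("mike", "john", "hi john")
send("anna", "john", "hello")

# Reading john's inbox is a single-key lookup, no index needed.
print([m["from_user"] for m in by_recipient["john"]])  # ['mike', 'anna']
```

The trade-off is the usual one for denormalization: every write happens twice, but each read touches exactly one partition.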

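Answer 1 (score: 5)

Alan Chandler has created a chat application on GitHub with the features you are asking for.

It uses two-phase authentication. The user is first authenticated in the forum and then against the chat database.

Here is the first authentication part of the schema (located in inc/user.sql):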

BEGIN;

CREATE TABLE users (
  uid integer primary key autoincrement NOT NULL,
  time bigint DEFAULT (strftime('%s','now')) NOT NULL,
  name character varying NOT NULL,
  role text NOT NULL DEFAULT 'R',      -- A (CEO), L (DIRECTOR), G (DEPT HEAD), H (SPONSOR) R(REGULAR)
  cap integer DEFAULT 0 NOT NULL,      -- 1 = blind, 2 = committee secretary, 4 = admin, 8 = mod, 16 = speaker 32 = can't whisper( OR of capabilities).
  password character varying NOT NULL, -- raw password
  rooms character varying,             -- a ":" separated list of rooms nos which define which rooms the user can go in
  isguest boolean DEFAULT 0 NOT NULL
);
CREATE INDEX userindex ON users(name);
-- Below here you can add the specific users for your set up in the form of INSERT Statements

-- This list is test users to cover the complete range of functions. Note names are converted to lowercase, so only put lowercase names in here
INSERT INTO users(uid,name,role,cap,password,rooms,isguest) VALUES
(1,'alice','A',4,'password','7',0),     -- CEO class user alice
(2,'bob','L',3,'password','8',0),       -- DIRECTOR class user bob
(3,'carol','G',2,'password','7:8:9',0); -- DEPT HEAD class user carol
-- (the remaining test users are not shown in this excerpt)

COMMIT;
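
The cap and rooms columns pack several values into one field: cap is a bitwise OR of capability flags and rooms is a ":"-separated list of room numbers. As a small illustration of how an application might unpack them, here is a Python sketch. The function names decode_cap and parse_rooms are hypothetical; the bit values come from the comment on the cap column above.

```python
# Capability bits, taken from the comment on the `cap` column.
CAPABILITIES = {
    1: "blind",
    2: "committee secretary",
    4: "admin",
    8: "mod",
    16: "speaker",
    32: "can't whisper",
}


def decode_cap(cap):
    """Return the names of all capability bits set in `cap`."""
    return [name for bit, name in CAPABILITIES.items() if cap & bit]


def parse_rooms(rooms):
    """Split a ':'-separated room list such as '7:8:9' into integers."""
    return [int(r) for r in rooms.split(":")] if rooms else []


print(decode_cap(3))         # ['blind', 'committee secretary']
print(parse_rooms("7:8:9"))  # [7, 8, 9]
```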

Here is the second authentication part of the schema (located in data/chat.sql):

CREATE TABLE users (
  uid integer primary key NOT NULL,
  time bigint DEFAULT (strftime('%s','now')) NOT NULL,
  name character varying NOT NULL,
  role char(1) NOT NULL default 'R',
  rid integer NOT NULL default 0,
  mod char(1) NOT NULL default 'N',
  question character varying,
  private integer NOT NULL default 0,
  cap integer NOT NULL default 0,
  rooms character varying
);

Here is the schema for the chat rooms, where you can see the user classes and examples of them:

CREATE TABLE rooms (
  rid integer primary key NOT NULL,
  name varchar(30) NOT NULL,
  type integer NOT NULL -- 0 = Open, 1 = meeting, 2 = guests can't speak, 3 moderated, 4 members(adult) only, 5 guests(child) only, 6 creaky door
) ;

INSERT INTO rooms (rid, name, type) VALUES 
(1, 'The Forum', 0),
(2, 'Operations Gallery', 2),  -- Guests Can't Speak
(3, 'Dungeon Club', 6),        -- creaky door
(4, 'Auditorium', 3),          -- Moderated Room
(5, 'Blue Room', 4),           -- Members Only (in Melinda's Backups this is Adults)
(6, 'Green Room', 5),          -- Guest Only (in Melinda's Backups this is Juveniles AKA Baby Backups)
(7, 'The Board Room', 1);      -- Various meeting rooms - need to be on users room list

There is another table for users that records their participation in a conversation:

CREATE table wid_sequence ( value integer);
INSERT INTO wid_sequence (value) VALUES (1);

CREATE TABLE participant (
  uid integer NOT NULL REFERENCES users (uid) ON DELETE CASCADE ON UPDATE CASCADE,
  wid integer NOT NULL,
  primary key (uid,wid)
);

The archive log looks like this:

CREATE TABLE chat_log (
  lid integer primary key,
  time bigint DEFAULT (strftime('%s','now')) NOT NULL,
  uid integer NOT NULL REFERENCES users (uid) ON DELETE CASCADE ON UPDATE CASCADE,
  name character varying NOT NULL,
  role char(1) NOT NULL,
  rid integer NOT NULL,
  type char(2) NOT NULL,
  text character varying
);
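
To see how this log table behaves, the following Python sketch loads it into an in-memory SQLite database and reads back the history of one room. It is only an illustration under a few assumptions: the foreign key is left out so the snippet is self-contained, the type value 'XX' is a placeholder because the real type codes are not shown here, and room_history is a made-up helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE chat_log (
      lid integer primary key,
      time bigint DEFAULT (strftime('%s','now')) NOT NULL,
      uid integer NOT NULL,  -- foreign key to users omitted in this sketch
      name character varying NOT NULL,
      role char(1) NOT NULL,
      rid integer NOT NULL,
      type char(2) NOT NULL,
      text character varying
    )
""")

# (uid, name, role, rid, type, text); 'XX' is a placeholder type code.
rows = [
    (1, "alice", "A", 7, "XX", "hello"),
    (2, "bob", "L", 8, "XX", "hi"),
    (1, "alice", "A", 7, "XX", "anyone here?"),
]
conn.executemany(
    "INSERT INTO chat_log (uid, name, role, rid, type, text) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)


def room_history(rid):
    """Return (name, text) pairs for one room, oldest first."""
    cur = conn.execute(
        "SELECT name, text FROM chat_log WHERE rid = ? ORDER BY lid", (rid,)
    )
    return cur.fetchall()


print(room_history(7))  # [('alice', 'hello'), ('alice', 'anyone here?')]
```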

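EDIT: However, this kind of data modelling is not a good fit for Cassandra. In Cassandra the data is not expected to fit on a single machine, so joins are not available. Denormalizing the data is therefore the practical choice. See the denormalized version of the chat_log table below: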

-- CQL has no NOT NULL constraint; only the primary key columns are mandatory.
CREATE TABLE chat_log (
  lid uuid,
  time timestamp,
  sender text,
  receiver text,
  room text,
  sender_role varchar,
  receiver_role varchar,
  rid decimal,
  status varchar,
  message text,
  -- sender is the partition key; time and lid order the messages and keep them unique
  PRIMARY KEY (sender, time, lid)
  -- PRIMARY KEY ((sender, room), time, lid) if you want the messages to be separated by room
) WITH CLUSTERING ORDER BY (time ASC, lid ASC);
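
Denormalizing here means that the roles of both users are copied into each chat_log row at write time, instead of being joined in from the users table at read time. A rough Python sketch of that step follows. USER_ROLES and build_chat_log_row are hypothetical names, and the role values are the ones from the test users above.

```python
# Roles per user, as they would be looked up once at write time.
USER_ROLES = {"alice": "A", "bob": "L", "carol": "G"}


def build_chat_log_row(sender, receiver, room, message):
    """Build one flat chat_log row with both users' roles copied in."""
    return {
        "sender": sender,
        "receiver": receiver,
        "room": room,
        "sender_role": USER_ROLES[sender],
        "receiver_role": USER_ROLES[receiver],
        "message": message,
    }


row = build_chat_log_row("bob", "alice", "The Board Room", "hello")
print(row["sender_role"], row["receiver_role"])  # L A
```

A consequence of this design is that a later role change does not update rows that were already written.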

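Now, to retrieve the data, you would use the following queries:

Whenever a user wants to load the messages sent by a user, I want to retrieve them and send them back to the user.

SELECT * FROM chat_log WHERE sender = 'bob' ORDER BY time ASC;

I want to retrieve all the messages from different users when the user requests them.

SELECT * FROM chat_log WHERE receiver = 'alice';

I want to store and retrieve the user classes.

SELECT * FROM chat_log WHERE sender_role = 'A'; -- messages sent by CEOs

SELECT * FROM chat_log WHERE receiver_role = 'A'; -- messages received by CEOs

Note that ORDER BY is only accepted when the partition key (sender) is restricted. The queries on receiver and on the role columns go through secondary indexes, where ORDER BY is not supported, so their results have to be sorted by the client.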


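Once the data is modelled, you need to create indexes to support these queries, as shown below:

  • To retrieve all the messages from different users

CREATE INDEX chat_log_receiver ON chat_log (receiver);

(sender is the partition key, so queries on it need no index.)

  • To retrieve all the messages for a user class

CREATE INDEX chat_log_sender_role ON chat_log (sender_role);
CREATE INDEX chat_log_receiver_role ON chat_log (receiver_role);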


I believe these examples will give you the approach you need.

If you want to learn more about data modelling in Cassandra, have a look at the following: