Using the IN operator in CQL on a partition key and a (clustering key or indexed column)

Time: 2015-09-30 18:19:20

Tags: cassandra cql cassandra-2.0 cql3

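I have an alerts table. I want to query it using the IN operator on two columns and a greater-than operator on a third column. I tried the approaches below without success. Can someone suggest a table design that makes this query work? My environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]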

Using 'type' in the partition key:

CREATE TABLE alerts (
    serialNumber text,
    time bigint,
    type text,
    time2 int,
    status text,
    parentId int,
    PRIMARY KEY ((serialNumber,type), time)
 ) WITH CLUSTERING ORDER BY (time DESC);

cqlsh:testdb> select * from alerts WHERE serialNumber IN ( '1','2') AND type IN ( '1','2','3' ) AND time > 1;
code=2200 [Invalid query] message="Partition KEY part serialNumber cannot be restricted by IN relation (only the last part of the partition key can)"

Using 'type' as a clustering key:

CREATE TABLE alerts (
    serialNumber text,
    time bigint,
    type text,
    time2 int,
    status text,
    parentId int,
    PRIMARY KEY (serialNumber, type, time)
) WITH CLUSTERING ORDER BY (type ASC,time DESC);

cqlsh:testdb> select * from alerts WHERE serialnumber IN ( '1','2') AND type IN ( 'a','b') and time > 1;
code=2200 [Invalid query] message="Clustering column "type" cannot be restricted by an IN relation"

Indexing 'type':

CREATE TABLE alerts (
    serialNumber text,
    time bigint,
    type text,
    time2 int,
    status text,
    parentId int,
    PRIMARY KEY (serialNumber, time)
) WITH CLUSTERING ORDER BY (time DESC);
CREATE INDEX alertsTypeIndex ON alerts(type);

select * from alerts WHERE serialnumber IN ( '1','2') and time > 1 AND type IN ( 'a','b');
code=2200 [Invalid query] message="IN predicates on non-primary-key columns (type) is not yet supported"

select * from alerts WHERE serialnumber IN ( '1','2') and time > 1 AND type = 'a';
code=2200 [Invalid query] message="Select on indexed columns and with IN clause for the PRIMARY KEY are not supported"

2 answers:

Answer 0 (score: 2)

Which version are you on? Your second example works for me in 2.2.1:

Connected to VaporTrails at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.2.1 | CQL spec 3.3.0 | Native protocol v4]
Use HELP for help.
aploetz@cqlsh> use stackoverflow ;
aploetz@cqlsh:stackoverflow> CREATE TABLE alerts (
                 ...     serialNumber text,
                 ...     time bigint,
                 ...     type text,
                 ...     time2 int,
                 ...     status text,
                 ...     parentId int,
                 ...     PRIMARY KEY (serialNumber, type, time)
                 ... ) WITH CLUSTERING ORDER BY (type ASC,time DESC);
aploetz@cqlsh:stackoverflow> INSERT INTO alerts (serialnumber , time,type,time2, status, parentid) VALUES ('1',0,'1',1,'1',1);
aploetz@cqlsh:stackoverflow> INSERT INTO alerts (serialnumber , time,type,time2, status, parentid) VALUES ('2',2,'2',2,'2',2);
aploetz@cqlsh:stackoverflow> INSERT INTO alerts (serialnumber , time,type,time2, status, parentid) VALUES ('3',3,'3',3,'3',3);
aploetz@cqlsh:stackoverflow> select * from alerts WHERE serialNumber IN ( '1','2') AND type IN ( '1','2','3' ) AND time > 1;

 serialnumber | type | time | parentid | status | time2
--------------+------+------+----------+--------+-------
            2 |    2 |    2 |        2 |      2 |     2

(1 rows)

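Even if you can get it to work, I would advise against it. Using the IN keyword on a partition key is known as the "multi-key" query anti-pattern. DataStax has a blurb in their documentation discussing why this is bad. Essentially, this approach does not scale, because many nodes have to be queried to satisfy this kind of query. You should find another way to model your alerts so that you don't need the IN relation.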
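One possible way to avoid the multi-key IN, assuming the first schema from the question (PRIMARY KEY ((serialNumber, type), time)), is to fan out from the client and issue one single-partition query per (serialNumber, type) pair. The following is only a minimal sketch in Python that builds the statements and their bound values; the helper name build_alert_queries is hypothetical, the %s placeholders assume a driver that accepts that style for simple statements, and actually executing the queries (for example asynchronously) is left out.

```python
from itertools import product

# Assumes the first schema from the question:
#   PRIMARY KEY ((serialNumber, type), time)
# Each statement restricts the full partition key by equality,
# so it targets exactly one partition.
QUERY = (
    "SELECT * FROM alerts "
    "WHERE serialNumber = %s AND type = %s AND time > %s"
)


def build_alert_queries(serial_numbers, types, min_time):
    """Hypothetical helper: return one (statement, parameters) tuple
    per (serialNumber, type) combination."""
    return [
        (QUERY, (serial, alert_type, min_time))
        for serial, alert_type in product(serial_numbers, types)
    ]


if __name__ == "__main__":
    # Same values as the failing query in the question.
    for statement, params in build_alert_queries(["1", "2"], ["1", "2", "3"], 1):
        print(statement, params)
```

Each generated query hits a single partition, so the work is spread across coordinators instead of one coordinator gathering results from many nodes. The trade-off is that the number of queries grows with the product of the two value lists.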

Answer 1 (score: 0)

I just ran into a similar problem: I wanted to use the IN operator on a Cassandra column, but wanted to make sure it wasn't as bad as @Aaron describes.

Take a look at the documentation, which explains a lot: https://docs.datastax.com/en/archived/cql/3.1/cql/cql_reference/select_r.html#reference_ds_d35_v2q_xj__selectIN

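TL;DR:

Using an IN condition on the last column of the partition key is recommended only when all preceding columns of the key are restricted by equality.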

CREATE TABLE parts (part_type text, part_name text, part_num int, part_year text, serial_num text, PRIMARY KEY ((part_type, part_name), part_num, part_year));

SELECT * FROM parts WHERE part_type='alloy' AND part_name='hubcap' AND part_num=1249 AND part_year IN ('2010', '2015');

For more examples and caveats, follow the link above.