Importing a CSV with multi-valued (collection) attributes into Cassandra

Time: 2013-08-09 05:40:54

Tags: csv collections cassandra

Suppose I want to import a CSV file into the following table:

CREATE TABLE example_table (
  id int PRIMARY KEY,
  comma_delimited_str_list list<ascii>,
  space_delimited_str_list list<ascii>
);

Here comma_delimited_str_list and space_delimited_str_list are two list attributes whose values are delimited in the CSV by a comma and a space, respectively.

A sample CSV record would be:

12345,"hello,world","stack overflow"

I would like "hello,world" and "stack overflow" to be treated as two multi-valued attributes.

How can I import a CSV file like this into the corresponding Cassandra table? Preferably with CQL COPY.

1 Answer:

Answer 0 (score: 3)

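Cassandra 1.2 (CQL 3) can import CSV files with multi-valued fields directly into a table. However, the multi-valued fields must be written in CQL literal syntax.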

For example, a list must take the form ['abc','def','ghi'], and a set must take the form {'123','456','789'}.
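The question's file uses plain delimited strings rather than CQL literals, so it would need to be rewritten before COPY can read it. The following is a minimal Python sketch of that conversion, assuming every record has exactly the three columns from the question. The helper names to_cql_list and convert_line are hypothetical and not part of cqlsh.

```python
import csv
import io


def to_cql_list(items):
    """Render a list of strings as a CQL list literal, e.g. ['a','b']."""
    # CQL escapes a single quote inside a string literal by doubling it.
    quoted = ["'" + item.replace("'", "''") + "'" for item in items]
    return "[" + ",".join(quoted) + "]"


def convert_line(line):
    """Convert one record in the question's layout into a COPY-ready record.

    Assumes three columns: an integer id, a comma-delimited string,
    and a whitespace-delimited string.
    """
    row_id, comma_field, space_field = next(csv.reader([line]))
    out = io.StringIO()
    # QUOTE_NONNUMERIC leaves the integer id bare and wraps the two
    # list literals in double quotes so their commas survive CSV parsing.
    writer = csv.writer(out, quoting=csv.QUOTE_NONNUMERIC)
    writer.writerow([
        int(row_id),
        to_cql_list(comma_field.split(",")),
        to_cql_list(space_field.split()),
    ])
    return out.getvalue().rstrip("\r\n")


print(convert_line('12345,"hello,world","stack overflow"'))
# prints: 12345,"['hello','world']","['stack','overflow']"
```

Running each line of the original file through convert_line and writing the results to a new file should give input in the shape that COPY expects.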

Here is an example of importing correctly formatted data from STDIN into the example_table mentioned by the OP:

cqlsh:demo> copy example_table from STDIN;
[Use \. on a line by itself to end input]
[copy] 12345,"['hello','world']","['stack','overflow']"
[copy] 56780,"['this','is','a','test','list']","['here','is','another','one']"
[copy] \.

2 rows imported in 11.304 seconds.
cqlsh:demo> select * from example_table;

 id    | comma_delimited_str_list  | space_delimited_str_list
-------+---------------------------+--------------------------
 12345 |            [hello, world] |        [stack, overflow]
 56780 | [this, is, a, test, list] | [here, is, another, one]

Importing incorrectly formatted list or set values from a CSV file raises an error:

cqlsh:demo> copy example_table from STDIN;
[Use \. on a line by itself to end input]
[copy] 9999,"hello","world"
Bad Request: line 1:108 no viable alternative at input ','
Aborting import at record #0 (line 1). Previously-inserted values still present.

The input above should be replaced with 9999,"['hello']","['world']":

cqlsh:demo> copy example_table from STDIN;
[Use \. on a line by itself to end input]
[copy] 9999,"['hello']","['world']"
[copy] \.

1 rows imported in 16.859 seconds.
cqlsh:demo> select * from example_table;

 id    | comma_delimited_str_list  | space_delimited_str_list
-------+---------------------------+--------------------------
  9999 |                   [hello] |                  [world]
 12345 |            [hello, world] |        [stack, overflow]
 56780 | [this, is, a, test, list] | [here, is, another, one]
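Because a malformed record aborts the import partway through, it may help to check the file before running COPY. The sketch below is a rough pre-check in Python, assuming the list columns sit at indexes 1 and 2 as in example_table. It only tests the overall shape of a list of single-quoted strings and is not a full CQL parser. The names is_cql_list_literal and find_bad_records are hypothetical.

```python
import csv
import re

# A single-quoted CQL string; a doubled quote ('') counts as an escaped quote.
CQL_STRING = r"'(?:[^']|'')*'"
# A list literal of such strings, e.g. ['a','b'] or [].
CQL_LIST_RE = re.compile(
    r"^\[\s*(?:%s(?:\s*,\s*%s)*)?\s*\]$" % (CQL_STRING, CQL_STRING)
)


def is_cql_list_literal(field):
    """Return True if the field looks like a CQL list of quoted strings."""
    return CQL_LIST_RE.match(field) is not None


def find_bad_records(lines, list_columns=(1, 2)):
    """Return (line_number, column_index) pairs that are not list literals."""
    bad = []
    for line_number, row in enumerate(csv.reader(lines), start=1):
        for col in list_columns:
            if not is_cql_list_literal(row[col]):
                bad.append((line_number, col))
    return bad


sample = '''12345,"['hello','world']","['stack','overflow']"
9999,"hello","world"'''.splitlines()

print(find_bad_records(sample))
# prints: [(2, 1), (2, 2)]
```

Here the second record is flagged in both list columns, which corresponds to the 9999,"hello","world" input that cqlsh rejected above.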