I am trying to figure out why I lose data when I load from a CSV file using COPY FROM. Here is my setup:
% cat /tmp/data
2-54-2014,"2014-01-01T01:00:00Z","1588.6960767"
2-54-2014,"2014-01-01T01:10:00Z","1587.64072333"
2-54-2014,"2014-01-01T01:20:00Z","1590.48448448"
2-54-2014,"2014-01-01T01:30:00Z","1590.72830295"
2-54-2014,"2014-01-01T01:40:00Z","1582.58896162"
2-54-2014,"2014-01-01T01:50:00Z","1569.62739561"
2-54-2014,"2014-01-01T02:00:00Z","1560.63714579"
2-54-2014,"2014-01-01T02:10:00Z","1551.97991093"
2-54-2014,"2014-01-01T02:20:00Z","1576.29093944"
2-54-2014,"2014-01-01T02:30:00Z","1584.34574486"
% cqlsh -k hats
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh:hats> CREATE TABLE power_1turb (
id TEXT, ts TIMESTAMP, value DOUBLE, PRIMARY KEY ((id), ts));
Now I try to load the data file into Cassandra:
cqlsh:hats> COPY power_1turb (id, ts, value) FROM '/tmp/data';
Using 7 child processes
Starting copy of hats.power_1turb with columns [id, ts, value].
Processed: 10 rows; Rate: 16 rows/s; Avg. rate: 24 rows/s
10 rows imported from 1 files in 0.412 seconds (0 skipped).
cqlsh:hats> select * from power_1turb ;
id | ts | value
-----------+---------------------------------+------------
2-54-2014 | 2013-12-31 18:00:00.000000+0000 | 1560.63715
(1 rows)
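Why does it load only 1 row, and why is it always the same row from the middle of the data? If I run a query such as
insert into power_1turb (id, ts, value) values ('2-54-2014','2014-01-01T01:30:00Z',1590.72830295);
it populates the database just fine.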
Answer 0 (score: 4)
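Specify DATETIMEFORMAT in the COPY command, because the date-time format in your file does not match the cqlsh default date-time format.
For your case, use the following COPY command: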
COPY power_1turb (id, ts, value) FROM 'data' WITH DATETIMEFORMAT = '%Y-%m-%dT%H:%M:%SZ';
Result, using Cassandra 2.2.5:
cassandra@cqlsh:test> SELECT * FROM power_1turb ;
id | ts | value
-----------+--------------------------+------------
2-54-2014 | 2014-01-01 01:00:00+0000 | 1588.69608
2-54-2014 | 2014-01-01 01:10:00+0000 | 1587.64072
2-54-2014 | 2014-01-01 01:20:00+0000 | 1590.48448
2-54-2014 | 2014-01-01 01:30:00+0000 | 1590.7283
2-54-2014 | 2014-01-01 01:40:00+0000 | 1582.58896
2-54-2014 | 2014-01-01 01:50:00+0000 | 1569.6274
2-54-2014 | 2014-01-01 02:00:00+0000 | 1560.63715
2-54-2014 | 2014-01-01 02:10:00+0000 | 1551.97991
2-54-2014 | 2014-01-01 02:20:00+0000 | 1576.29094
2-54-2014 | 2014-01-01 02:30:00+0000 | 1584.34574
(10 rows)
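The single row in the question is consistent with how writes to this table behave: rows that share the same primary key ((id), ts) overwrite one another. The sketch below models the table as a Python dict keyed by (id, ts). It assumes, as the question's output suggests, that every timestamp ended up as the same value when the format did not match. It is an illustration only, not Cassandra or cqlsh code, and the helper name `load` is made up.

```python
from datetime import datetime

# First three lines of /tmp/data
lines = [
    ("2-54-2014", "2014-01-01T01:00:00Z", 1588.6960767),
    ("2-54-2014", "2014-01-01T01:10:00Z", 1587.64072333),
    ("2-54-2014", "2014-01-01T01:20:00Z", 1590.48448448),
]

def load(rows, parse_ts):
    """Model a table with PRIMARY KEY ((id), ts) as a dict: same key overwrites."""
    table = {}
    for id_, ts, value in rows:
        table[(id_, parse_ts(ts))] = value
    return table

# Assumption: a mismatched format yields one and the same timestamp for every line
collapsed = load(lines, lambda ts: datetime(2013, 12, 31, 18, 0))
# With the matching format each line keeps its own timestamp
distinct = load(lines, lambda ts: datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ"))

print(len(collapsed), len(distinct))  # 1 3
```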
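Relevant documentation for cqlsh in Cassandra 2.2.5:
DATETIMEFORMAT, which used to be called TIMEFORMAT, a string containing the Python strftime format for date and time values, such as '%Y-%m-%d %H:%M:%S%z'. It defaults to the time_format value in cqlshrc.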
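As a quick sanity check before running COPY, a format string can be tried against a sample value with Python's own datetime.strptime, which uses the same directive syntax. This is only a sketch: it does not reproduce how cqlsh parses timestamps internally, and the helper name `matches_format` is hypothetical.

```python
from datetime import datetime

SAMPLE = "2014-01-01T01:00:00Z"  # first timestamp in /tmp/data

def matches_format(value, fmt):
    """Return True if value parses under the strftime-style format fmt."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

# Format passed to DATETIMEFORMAT in the COPY command above
print(matches_format(SAMPLE, "%Y-%m-%dT%H:%M:%SZ"))   # True
# A format with a space between date and time instead of 'T'
print(matches_format(SAMPLE, "%Y-%m-%d %H:%M:%S%z"))  # False
```

If the check prints False for the format you intend to use, adjust the format string until it matches the values in the file.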