Cassandra inserts only one row when 100k are expected

Date: 2016-06-14 13:29:22

Tags: python-3.x cassandra cql

I am trying to insert 100k rows with the CQL Python driver:

# no_of_rows = 100k
for row in range(no_of_rows):
    session.execute("INSERT INTO test_table (key1, key2, key3) VALUES ('test', 'test', 'test')")

But only one row ends up in test_table (checked in the Cassandra CQL shell with select * from test_table). How can I fix this?

Update

If I try

for row in range(no_of_rows):
    session.execute("INSERT INTO test_table (key1, key2, key3) VALUES ('test' + str(row), 'test', 'test')")

no rows are inserted at all. Here key1 is the primary key.

describe test_table

CREATE TABLE test_keyspace.test_table (
key1 text PRIMARY KEY,
key2 text,
key3 text
) WITH bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';

2 Answers:

Answer 0 (score: 4)

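Cassandra primary keys are unique. Writing to the same key 100000 times in place leaves 1 row.

This means that if your primary key were defined as PRIMARY KEY(key1,key2,key3) and you inserted 'test','test','test' 100000 times...

...it would write 'test','test','test' to the same partition 100000 times.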
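The overwrite behaviour described above can be mimicked with a plain Python dict keyed by the primary key. This is only a toy model, not how Cassandra stores data, and the insert helper and both dict names are hypothetical. It keys rows by key1 alone, matching the table definition in the question.

```python
# Toy model of a table: a dict keyed by the primary key (key1).
# Hypothetical helper for illustration; this is not the Cassandra driver.
def insert(table, key1, key2, key3):
    """Write a row; an existing row with the same key1 is overwritten."""
    table[key1] = (key2, key3)

no_of_rows = 100000

# Same key every time: each write replaces the previous one.
same_key = {}
for row in range(no_of_rows):
    insert(same_key, 'test', 'test', 'test')

# A distinct key per iteration: each write creates a new row.
distinct_keys = {}
for row in range(no_of_rows):
    insert(distinct_keys, 'test' + str(row), 'test', 'test')

print(len(same_key))       # 1
print(len(distinct_keys))  # 100000
```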

To get your Python code working, I made a few adjustments, such as creating a separate variable for the key (key1) and using a prepared statement:

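from cassandra.cluster import Cluster

# Assumption: a node on localhost and the test_keyspace keyspace from the question.
cluster = Cluster(['127.0.0.1'])
session = cluster.connect('test_keyspace')

# Prepare once; each execute then only sends the bound values.
pStatement = session.prepare("""
    INSERT INTO test_table (key1, key2, key3) VALUES (?, ?, ?);
""")

no_of_rows = 100000

for row in range(no_of_rows):
    key = 'test' + str(row)  # distinct primary key per row
    session.execute(pStatement, [key, 'test', 'test'])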

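You can then check the result in the Cassandra CQL shell with select * from test_table.

I feel obliged to mention that multi-key queries (querying more than one partition key at a time) and unbound queries (SELECTs without a WHERE clause) are definite anti-patterns in Cassandra. They may appear to work fine in a dev/test environment. But once you reach a production-scale cluster with dozens of nodes, these kinds of queries introduce a lot of network time into the equation, because they have to scan every node to compile the query result.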
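For contrast with an unbound SELECT, a lookup restricted to a single partition key might look like the sketch below. It assumes an already connected session object from the Python driver; prepare_lookup and fetch_row are hypothetical helper names, not part of the driver.

```python
def prepare_lookup(session):
    """Prepare the single-key SELECT once so it can be reused for every lookup."""
    return session.prepare(
        "SELECT key1, key2, key3 FROM test_table WHERE key1 = ?"
    )

def fetch_row(session, lookup_stmt, key):
    """Return the row whose partition key (key1) equals `key`, or None."""
    rows = session.execute(lookup_stmt, [key])
    # Take the first row if there is one; key1 is the whole primary key,
    # so at most one row can match.
    for r in rows:
        return r
    return None

# Usage, assuming `session` is connected to test_keyspace:
#   stmt = prepare_lookup(session)
#   row = fetch_row(session, stmt, 'test42')
```

Because the WHERE clause names the full partition key, a request like this can be served by the replicas that own that key rather than by every node in the cluster.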

Answer 1 (score: 1)

Your new code has a bug in the string concatenation. It should be:

for row in range(no_of_rows):
    session.execute("INSERT INTO test_table (key1, key2, key3) VALUES ('test" + str(row) + "', 'test', 'test')")
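To see the difference without a cluster, the two query strings can simply be built and printed. This is a small sketch; the value 7 for row is an arbitrary example.

```python
row = 7  # arbitrary example value of the loop variable

# Version from the question's update: the "+ str(row)" sits inside the
# string literal, so Python never evaluates it and it is sent as plain text.
buggy = "INSERT INTO test_table (key1, key2, key3) VALUES ('test' + str(row), 'test', 'test')"

# Corrected version: the literal is closed before the concatenation and
# reopened afterwards, so the row number ends up inside the CQL string value.
fixed = "INSERT INTO test_table (key1, key2, key3) VALUES ('test" + str(row) + "', 'test', 'test')"

print(buggy)
print(fixed)
```

The first string still contains the text + str(row), while the second contains the value 'test7'.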