I am trying to use the Python Cassandra Driver with Batches and PreparedStatements. I am trying to insert 100k records, but I get the following error:
E ValueError: Batch statement cannot contain more than 65535 statements.
How can I split the records into chunks and insert them separately? My code is:
from cassandra.query import BatchStatement
from cassandra import ConsistencyLevel

insert_records = session.prepare("INSERT INTO sample_table (key1, key2, key3) VALUES (?, ?, ?)")
batch = BatchStatement(consistency_level=ConsistencyLevel.QUORUM)
no_of_rows = df_100k.shape[0]  # df_100k is a dataframe containing 100k rows
for row in range(no_of_rows):
    batch.add(insert_records, [df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']])
session.execute(batch)
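One way to stay under the 65535-statement cap is to flush the batch every few hundred rows and start a fresh one. Below is a minimal sketch of that chunking approach, reusing session, no_of_rows, df_100k, and the prepared insert_records from above; CHUNK_SIZE = 100 is an arbitrary choice, anything below the cap works:

CHUNK_SIZE = 100  # arbitrary chunk size, must stay below 65535

batch = BatchStatement(consistency_level=ConsistencyLevel.QUORUM)
for row in range(no_of_rows):
    batch.add(insert_records, [df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']])
    if (row + 1) % CHUNK_SIZE == 0:
        # Chunk is full: execute it and start a fresh batch
        session.execute(batch)
        batch = BatchStatement(consistency_level=ConsistencyLevel.QUORUM)
if no_of_rows % CHUNK_SIZE:
    session.execute(batch)  # flush the final partial chunk

Keep in mind that batches in Cassandra are meant for atomicity, not for bulk-load throughput, so even chunked batches can end up slower than plain asynchronous inserts (see UPDATE2).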
UPDATE1
I tried execute_async like this:
for row in range(no_of_rows):
    session.execute_async("INSERT INTO invoices (key1, key2, key3) VALUES (df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']")
but no rows were inserted into the table. What is the problem here?
UPDATE2
The problem was that I forgot to use the prepared statement from session.prepare when calling session.execute_async. execute_async did solve my problem over Batches, and it improved performance significantly.
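For reference, here is a minimal sketch of the working async version under that fix, preparing the statement once and binding the values per row; collecting the futures and waiting on them at the end is my addition, so that any failed write surfaces as an exception:

insert_records = session.prepare("INSERT INTO invoices (key1, key2, key3) VALUES (?, ?, ?)")
futures = []
for row in range(no_of_rows):
    # Bind values to the prepared statement instead of pasting them into the CQL string
    futures.append(session.execute_async(insert_records, [df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']]))
for future in futures:
    future.result()  # blocks until that insert completes; raises on failure

Firing 100k requests at once can overload the cluster; the driver also ships cassandra.concurrent.execute_concurrent_with_args, which throttles the number of in-flight requests for you (the concurrency value below is an arbitrary choice):

from cassandra.concurrent import execute_concurrent_with_args

params = [(df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']) for row in range(no_of_rows)]
execute_concurrent_with_args(session, insert_records, params, concurrency=50)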