I am trying to use the Python Cassandra Driver with Batches and PreparedStatements. I am trying to insert 100k records, but I get the following error:
E ValueError: Batch statement cannot contain more than 65535 statements.
How can I split the records into chunks and insert them separately? My code is:
from cassandra.query import BatchStatement
from cassandra import ConsistencyLevel

insert_records = session.prepare("INSERT INTO sample_table (key1, key2, key3) VALUES (?, ?, ?)")
batch = BatchStatement(consistency_level=ConsistencyLevel.QUORUM)
no_of_rows = df_100k.shape[0]  # df_100k is a dataframe containing 100k rows
for row in range(no_of_rows):
    batch.add(insert_records, [df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']])
session.execute(batch)
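One way to stay under the 65535-statement cap is to flush the batch every few hundred rows and start a fresh one. Below is a minimal sketch of that chunking approach, reusing session, no_of_rows, df_100k, and the prepared insert_records from above; CHUNK_SIZE = 100 is an arbitrary choice, anything below the cap works:

CHUNK_SIZE = 100  # arbitrary chunk size, must stay below 65535

batch = BatchStatement(consistency_level=ConsistencyLevel.QUORUM)
for row in range(no_of_rows):
    batch.add(insert_records, [df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']])
    if (row + 1) % CHUNK_SIZE == 0:
        # Chunk is full: execute it and start a fresh batch
        session.execute(batch)
        batch = BatchStatement(consistency_level=ConsistencyLevel.QUORUM)
if no_of_rows % CHUNK_SIZE:
    session.execute(batch)  # flush the final partial chunk

Keep in mind that batches in Cassandra are meant for atomicity, not for bulk-load throughput, so even chunked batches can end up slower than plain asynchronous inserts (see UPDATE2).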
UPDATE1
I tried execute_async like this:
for row in range(no_of_rows):
    session.execute_async("INSERT INTO invoices (key1, key2, key3) VALUES (df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']")
but no rows were inserted into the table. What is the problem here?
UPDATE2
The problem was that I forgot to use the prepared statement from session.prepare when calling session.execute_async. execute_async did solve my problem over Batches, and it improved performance significantly.
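For reference, here is a minimal sketch of the working async version under that fix, preparing the statement once and binding the values per row; collecting the futures and waiting on them at the end is my addition, so that any failed write surfaces as an exception:

insert_records = session.prepare("INSERT INTO invoices (key1, key2, key3) VALUES (?, ?, ?)")
futures = []
for row in range(no_of_rows):
    # Bind values to the prepared statement instead of pasting them into the CQL string
    futures.append(session.execute_async(insert_records, [df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']]))
for future in futures:
    future.result()  # blocks until that insert completes; raises on failure

Firing 100k requests at once can overload the cluster; the driver also ships cassandra.concurrent.execute_concurrent_with_args, which throttles the number of in-flight requests for you (the concurrency value below is an arbitrary choice):

from cassandra.concurrent import execute_concurrent_with_args

params = [(df_100k.iloc[row]['key1'], df_100k.iloc[row]['key2'], df_100k.iloc[row]['key3']) for row in range(no_of_rows)]
execute_concurrent_with_args(session, insert_records, params, concurrency=50)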