Spark SQL and Cassandra: deleting records

Time: 2016-04-28 04:48:48

Tags: cassandra apache-spark-sql

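Is there a way to delete some records based on a select query?

I have a query that shows the duplicates. I need to take those IDs and delete them. How can I do this in Spark SQL?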

1 answer:

Answer 0: (score: 0)

Spark SQL does not support DELETE.

If the number of IDs to delete is small, you can do it with the Cassandra driver instead of Spark:

import scala.collection.JavaConverters._
import com.datastax.driver.core.{BatchStatement, Cluster}
import com.datastax.driver.core.querybuilder.QueryBuilder

// host_ip, keyspace and table are placeholders for your own values
val cluster = Cluster.builder().addContactPoint(host_ip).build()
val session = cluster.connect(keyspace)

val idsToDelete = ... // perform your query and collect the ids

// Build one DELETE statement per id
val queries = idsToDelete.map { id =>
  QueryBuilder.delete().from(keyspace, table).where(QueryBuilder.eq("id", id))
}

// Group the deletes into a single batch and execute it
val batch = new BatchStatement().addAll(queries.asJava)
session.execute(batch)

cluster.close()
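
If the list of IDs is not that small, a single batch may run into the server's batch size thresholds, so one option is to send the deletes in smaller groups. The following is only a rough sketch under assumptions: `chunked` and `deleteInChunks` are hypothetical helper names (not part of the driver), the `id` column is assumed to be a `bigint`, and the default `chunkSize` of 50 is an arbitrary starting point that you would need to tune for your cluster.

```scala
import com.datastax.driver.core.{BatchStatement, Session}
import com.datastax.driver.core.querybuilder.QueryBuilder

// Hypothetical helper: split the ids into groups of at most chunkSize elements.
def chunked[A](ids: Seq[A], chunkSize: Int): List[Seq[A]] = {
  require(chunkSize > 0, "chunkSize must be positive")
  ids.grouped(chunkSize).toList
}

// Hypothetical helper: run one unlogged batch of DELETEs per group of ids.
// Assumes the table has a bigint column named "id".
def deleteInChunks(session: Session, keyspace: String, table: String,
                   ids: Seq[Long], chunkSize: Int = 50): Unit = {
  chunked(ids, chunkSize).foreach { group =>
    val batch = new BatchStatement(BatchStatement.Type.UNLOGGED)
    group.foreach { id =>
      // Box the Scala Long explicitly so it is passed to the driver as java.lang.Long
      batch.add(
        QueryBuilder.delete().from(keyspace, table)
          .where(QueryBuilder.eq("id", Long.box(id))))
    }
    session.execute(batch)
  }
}
```

With the `session` from above you would call something like `deleteInChunks(session, keyspace, table, idsToDelete)`. An unlogged batch is used here because the deletes are independent of each other; if the IDs live in many different partitions, executing the statements one by one may work just as well.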