Is there a way to delete some records based on a select query?
I have a query that shows the duplicates. I need to get those IDs and delete them. How can I do this in Spark SQL?
Answer 0 (score: 0)
Spark SQL does not support DELETE.
If the number of IDs to delete is small, you can use the Cassandra driver instead of Spark to do this:
import scala.collection.JavaConverters._
import com.datastax.driver.core.{BatchStatement, Cluster}
import com.datastax.driver.core.querybuilder.QueryBuilder

// host_ip, keyspace and table are placeholders for your own values
val cluster = Cluster.builder().addContactPoint(host_ip).build()
val session = cluster.connect(keyspace)
val idsToDelete = ... // perform your query and collect the ids
// Build one DELETE statement per id
val queries = idsToDelete.map(id => QueryBuilder.delete().from(keyspace, table).where(QueryBuilder.eq("id", id)))
// Group the deletes into a single batch and execute it
val batch = new BatchStatement().addAll(queries.asJava)
session.execute(batch)
// Release the driver's connections when done
cluster.close()
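
The answer assumes the number of IDs is small. If the list grows, one possible approach is to split the IDs into fixed-size chunks and send each chunk as its own batch, since Cassandra warns about oversized batches. The sketch below only shows the chunking step in plain Scala; the `DeleteChunks` object, the `chunkIds` helper and the chunk size of 3 are hypothetical names and values chosen for illustration and are not part of the original answer.

```scala
// Hypothetical helper (not from the original answer): split the ids
// into chunks of at most `chunkSize` so each chunk can be sent as its own batch.
object DeleteChunks {
  def chunkIds[A](ids: Seq[A], chunkSize: Int): List[Seq[A]] = {
    require(chunkSize > 0, "chunkSize must be positive")
    // grouped yields consecutive slices; the last one may be shorter
    ids.grouped(chunkSize).toList
  }

  def main(args: Array[String]): Unit = {
    val ids = (1 to 7).toList // stand-in for the collected ids
    // Prints one line per chunk: "1,2,3", "4,5,6", "7"
    chunkIds(ids, 3).foreach(chunk => println(chunk.mkString(",")))
  }
}
```

Each chunk could then be mapped to delete statements and executed with its own BatchStatement, in the same way as the code above does for the full list.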