Question

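I want to overwrite the contents of a SQL table. Currently, when I use overwrite mode, Spark drops the table and then creates a new one. Because of restrictive permissions, I cannot do that. My current workaround is to use another Python package to clear the table and then write the DataFrame to the SQL database. This does not seem right.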

Is there a way to change this behavior?

For documentation on the function used, see here.

# Usage example (current state)
# Pre: Clear table.
df.write.jdbc(url=url, table="baz", mode='overwrite', properties=properties)

Answer 1

I just ran into a similar problem, and it turns out that JDBC tables can be truncated as of Spark 2.1.0.

To enable truncation, set mode='overwrite' and add an extra key to the properties:

properties['truncate'] = 'true'
df.write.jdbc(url=url, table="baz", mode='overwrite', properties=properties)
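
As a minimal sketch of the same idea, the call can be wrapped in a small helper. The name overwrite_with_truncate is hypothetical and not part of Spark; the sketch assumes df is a Spark DataFrame and that url and properties are defined as in the question. It copies the properties first so the caller's dict is not modified.

```python
def overwrite_with_truncate(df, url, table, properties):
    """Overwrite a JDBC table, asking Spark to truncate it rather than drop it.

    Hypothetical helper for illustration; `df` is assumed to be a Spark DataFrame.
    """
    # Copy so the caller's connection properties stay unchanged.
    props = dict(properties)
    # With mode='overwrite', this key asks Spark to truncate the existing
    # table instead of dropping and recreating it (Spark 2.1.0 and later).
    props["truncate"] = "true"
    df.write.jdbc(url=url, table=table, mode="overwrite", properties=props)
    return props
```

It would be called as overwrite_with_truncate(df, url, "baz", properties). Whether the truncate itself is permitted still depends on the rights granted to the database user.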