Question

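In Databricks, I am trying to write a DataFrame to SQL Data Warehouse using the JDBC connector. I am not using a blob container as an intermediate location for reads or writes, so there is a direct connection between the driver and SQL Server. The code I am using is: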

# mySchema, user, pswd, sqlserver, port and database are defined elsewhere (definitions not shown)
table = 'SampleTableV2'

df1 = spark.createDataFrame([(1, 'Bilal', 'Shafqat'),
                             (2, 'Ali', 'Azam'),
                             (3, 'Hamdan', 'Sultan'),
                             (4, 'Faizan', 'Pathan'),
                             (5, 'Tehseen', 'Virk'),
                             (6, 'Shahzad', 'Badar')
                            ], mySchema)


# Insert the rows into the Azure SQL table
df1.write \
    .option('user', user) \
    .option('password', pswd) \
    .mode('append') \
    .jdbc('jdbc:sqlserver://' + sqlserver + ':' + port + ';database=' + database, table)

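If I first create the table in SQL Server using Management Studio or the console and then run this command in the Databricks notebook, the data is appended to the table. However, if I use 'overwrite' mode, or if the table does not yet exist in SQL Server and the command has to create it before inserting, it throws an exception saying that the column 'FirstName' cannot participate in a columnstore index. I have also noticed that this generally happens with string columns rather than int columns. I also tried setting the tableOptions value, but that failed as well.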

.option("tableOptions","heap,distribution=HASH([Id])")
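For context, `tableOptions` is an option of the Databricks SQL DW connector (`com.databricks.spark.sqldw`), not of the generic Spark JDBC writer, which may be why it had no effect here. The generic writer documents `createTableColumnTypes` and `createTableOptions` instead. The sketch below shows how those two options might be built, on the assumption that the error comes from Spark creating string columns as NVARCHAR(MAX). The helper name `jdbc_create_table_options`, the `LastName` column, and the length of 100 are assumptions for illustration only and have not been tested against this database.

```python
def jdbc_create_table_options(string_columns, distribution_column, varchar_length=100):
    """Build Spark JDBC writer options for table creation (hypothetical helper)."""
    # Give each string column a bounded type so the writer does not
    # fall back to its default unbounded string type.
    column_types = ", ".join(
        f"{name} VARCHAR({varchar_length})" for name in string_columns
    )
    return {
        # Column types used in the generated CREATE TABLE statement
        "createTableColumnTypes": column_types,
        # Text appended after the generated CREATE TABLE statement
        "createTableOptions": f"WITH (HEAP, DISTRIBUTION = HASH([{distribution_column}]))",
    }


# 'LastName' is an assumed column name; only 'Id' and 'FirstName' appear in the post.
opts = jdbc_create_table_options(["FirstName", "LastName"], "Id")

# Possible usage in the notebook (needs a live Spark session and database,
# so it is left commented out):
# url = 'jdbc:sqlserver://' + sqlserver + ':' + port + ';database=' + database
# df1.write.options(**opts).option('user', user).option('password', pswd) \
#     .mode('overwrite').jdbc(url, table)
```

The options only matter when Spark itself creates the table, that is, in 'overwrite' mode or when the table does not exist yet. Appending to a table created beforehand is unaffected.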

Any help would be greatly appreciated.

Exception while writing a DataFrame to sql-server 2017: column name has a data type that cannot participate in a columnstore index

0 answers: