Question

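I get the following exception when trying to write a DataFrame to BigQuery using the Simba driver. The DataFrame's schema is shown below. I created a table in BigQuery with the same schema.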

df.printSchema
root
 |-- empid: integer (nullable = true)
 |-- firstname: string (nullable = true)
 |-- middle: string (nullable = true)
 |-- last: string (nullable = true)
 |-- gender: string (nullable = true)
 |-- age: double (nullable = true)
 |-- weight: integer (nullable = true)
 |-- salary: integer (nullable = true)
 |-- city: string (nullable = true)

The Simba driver throws the following error:

 Caused by: com.simba.googlebigquery.support.exceptions.GeneralException: [Simba][BigQueryJDBCDriver](100032) Error executing query job. Message: 400 Bad Request
    {
      "code" : 400,
      "errors" : [ {
        "domain" : "global",
        "location" : "q",
        "locationType" : "parameter",
        "message" : "Syntax error: Unexpected string literal \"empid\" at [1:38]",
        "reason" : "invalidQuery"
      } ],
      "message" : "Syntax error: Unexpected string literal \"empid\" at [1:38]",
      "status" : "INVALID_ARGUMENT"
    }
      ... 24 more

Here is the code I used:

val url = "jdbc:bigquery://https://www.googleapis.com/bigquery/v2;ProjectId=my_project_id;OAuthType=0;OAuthPvtKeyPath=service_account_jsonfile;OAuthServiceAcctEmail=googleaccount"
df.write.mode(SaveMode.Append).jdbc(url,"orders_dataset.employee",new java.util.Properties)

Please let me know if any other configuration is missing or what is going wrong. Thanks in advance!

Answer 1

This behavior appears to be caused by Spark, which sends extra quotes around the column names.
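To make the quoting problem concrete, here is a minimal, Spark-free sketch of how an INSERT statement of this shape ends up with double-quoted column names. The `QuotingDemo` object and its `insertStatement` helper are hypothetical names used only for illustration; they approximate, rather than reproduce, what Spark's JDBC writer does internally. BigQuery reads a double-quoted token as a string literal, which would explain the "Unexpected string literal" message.

```scala
// Minimal sketch (no Spark required) of how the quoted INSERT statement arises.
object QuotingDemo {
  // Approximates the default JDBC dialect behaviour: wrap the name in double quotes.
  def defaultQuote(column: String): String = "\"" + column + "\""

  // Hypothetical helper that assembles a parameterised INSERT statement.
  def insertStatement(table: String, columns: Seq[String], quote: String => String): String = {
    val cols = columns.map(quote).mkString(",")
    val placeholders = columns.map(_ => "?").mkString(",")
    s"INSERT INTO $table ($cols) VALUES ($placeholders)"
  }

  def main(args: Array[String]): Unit = {
    val columns = Seq("empid", "firstname", "middle")
    val quoted = insertStatement("orders_dataset.employee", columns, defaultQuote)
    val unquoted = insertStatement("orders_dataset.employee", columns, c => c)

    println(quoted)   // column names wrapped in double quotes
    println(unquoted) // column names left as-is

    // 1-based column of the first double quote: 38, the same position as [1:38] in the error.
    println(quoted.indexOf('"') + 1)
  }
}
```

The first double quote lands at column 38 of the generated statement, which lines up with the `[1:38]` position reported by BigQuery for the table name `orders_dataset.employee`.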

To work around this in Spark, add the following code after creating the Spark context and before creating the DataFrame:

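import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Register a dialect for BigQuery JDBC URLs that leaves column names unquoted.
JdbcDialects.registerDialect(new JdbcDialect() {
  // Apply this dialect only to BigQuery JDBC URLs.
  override def canHandle(url: String): Boolean =
    url.toLowerCase.startsWith("jdbc:bigquery:")

  // Return the column name as-is instead of wrapping it in double quotes.
  override def quoteIdentifier(column: String): String = column
})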
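As a possible variation, the dialect could quote identifiers with backticks, which is how BigQuery standard SQL quotes identifiers, instead of dropping quoting altogether. The sketch below is an assumption rather than part of the original answer: the names `BigQueryBacktickDialect` and `RegisterDialect` are made up for illustration, and it has not been verified against the Simba driver. It needs Spark on the classpath.

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Hypothetical alternative dialect: quote identifiers with backticks
// rather than returning them unquoted.
object BigQueryBacktickDialect extends JdbcDialect {
  // Apply this dialect only to BigQuery JDBC URLs.
  override def canHandle(url: String): Boolean =
    url.toLowerCase.startsWith("jdbc:bigquery:")

  // Wrap the column name in backticks.
  override def quoteIdentifier(column: String): String = s"`$column`"
}

object RegisterDialect {
  def main(args: Array[String]): Unit = {
    // Register before any DataFrame is written over JDBC.
    JdbcDialects.registerDialect(BigQueryBacktickDialect)

    // Look up the dialect Spark would pick for a BigQuery URL and show its quoting.
    val dialect = JdbcDialects.get("jdbc:bigquery://https://www.googleapis.com/bigquery/v2")
    println(dialect.quoteIdentifier("empid"))
  }
}
```

Backtick quoting may help when a column name collides with a reserved word, whereas the unquoted version from the answer is the simpler fix for plain names such as `empid`.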
