I have a DataFrame whose schema includes an Array[String] field:
StructField("user_agent", ArrayType(StringType, true))
...
myDataframe.printSchema
(an excerpt)
|-- user_agent: array (nullable = true)
|    |-- element: string (containsNull = true)
I am writing to Redshift with the com.databricks.spark.redshift package, and I get this error:
java.lang.IllegalArgumentException: Don't know how to save StructField(user_agent,ArrayType(StringType,true),true) to JDBC
at com.databricks.spark.redshift.JDBCWrapper$$anonfun$schemaString$1.apply(RedshiftJDBCWrapper.scala:253)
at com.databricks.spark.redshift.JDBCWrapper$$anonfun$schemaString$1.apply(RedshiftJDBCWrapper.scala:233)
Is it possible to write this data type to Redshift using this package?
Answer 0 (score: 0)
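spark-redshift maps only the following Spark SQL data types to Redshift column types. ArrayType is not among them, so it falls through to the final case, which throws the exception you are seeing: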
field.dataType match {
  case IntegerType => "INTEGER"
  case LongType => "BIGINT"
  case DoubleType => "DOUBLE PRECISION"
  case FloatType => "REAL"
  case ShortType => "INTEGER"
  case ByteType => "SMALLINT" // Redshift does not support the BYTE type.
  case BooleanType => "BOOLEAN"
  case StringType =>
    if (field.metadata.contains("maxlength")) {
      s"VARCHAR(${field.metadata.getLong("maxlength")})"
    } else {
      "TEXT"
    }
  case TimestampType => "TIMESTAMP"
  case DateType => "DATE"
  case t: DecimalType => s"DECIMAL(${t.precision},${t.scale})"
  case _ => throw new IllegalArgumentException(s"Don't know how to save $field to JDBC")
}
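Since the mapping above has no case for ArrayType, one possible workaround is to turn the array column into a plain string column before writing. The following is only a sketch of that idea and not something spark-redshift provides. The helper name flattenStringArrays, the "|" separator and the maxlength value of 1024 are all assumptions made for illustration, and the demo uses SparkSession (Spark 2.x or later), which may differ from the Spark version in the question. It uses Spark's concat_ws to join the array elements and attaches the "maxlength" metadata that the StringType case above reads, because Redshift treats TEXT as VARCHAR(256) and a joined string can easily be longer than that.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, concat_ws}
import org.apache.spark.sql.types.{ArrayType, MetadataBuilder, StringType}

object FlattenArrayColumns {

  // Hypothetical helper (not part of spark-redshift): replaces every
  // array<string> column with a single delimited string column.
  // `sep` and `maxLength` are illustrative defaults, not recommendations.
  def flattenStringArrays(df: DataFrame,
                          sep: String = "|",
                          maxLength: Long = 1024L): DataFrame = {
    // Metadata key "maxlength" is the one checked in the StringType case above.
    val meta = new MetadataBuilder().putLong("maxlength", maxLength).build()

    df.schema.fields.foldLeft(df) { (acc, field) =>
      field.dataType match {
        case ArrayType(StringType, _) =>
          // concat_ws accepts array<string> columns and joins their elements.
          val joined = concat_ws(sep, col(field.name)).as(field.name, meta)
          acc.withColumn(field.name, joined)
        case _ =>
          acc // leave all other columns untouched
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Local session for demonstration only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("flatten-array-demo")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, Seq("Mozilla/5.0", "Chrome"))).toDF("id", "user_agent")
    val flat = flattenStringArrays(df)

    flat.printSchema() // user_agent is now a string column
    flat.show(false)

    spark.stop()
  }
}
```

The flattened DataFrame can then go through the same write call you already use. Note that this is lossy if an element can itself contain the separator, so pick a separator that cannot occur in your data, or serialize the array in some other unambiguous way.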