I am trying to write a Spark DataFrame to Cassandra. With a simple DataFrame schema like the one below, it works:
root
|-- id: string (nullable = true)
|-- url: string (nullable = true)
However, when I try to write a DataFrame that contains StructTypes, with a schema like this:
root
|-- crawl: struct (nullable = true)
|    |-- id: string (nullable = true)
I get the following exception:
Exception in thread "main" java.lang.IllegalArgumentException: Unsupported type: StructType(StructField(id,StringType,true))
at com.datastax.spark.connector.types.ColumnType$.unsupportedType$1(ColumnType.scala:132)
at com.datastax.spark.connector.types.ColumnType$.fromSparkSqlType(ColumnType.scala:155)
at com.datastax.spark.connector.mapper.DataFrameColumnMapper$$anonfun$1.apply(DataFrameColumnMapper.scala:18)
at com.datastax.spark.connector.mapper.DataFrameColumnMapper$$anonfun$1.apply(DataFrameColumnMapper.scala:16)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at com.datastax.spark.connector.mapper.DataFrameColumnMapper.newTable(DataFrameColumnMapper.scala:16)
at com.datastax.spark.connector.cql.TableDef$.fromDataFrame(Schema.scala:215)
at com.datastax.spark.connector.DataFrameFunctions.createCassandraTable(DataFrameFunctions.scala:26)
My code looks like this:
val df = sqlContext.read.parquet(input)
df.createCassandraTable(keyspace, table)
df.write
.format("org.apache.spark.sql.cassandra")
.options(Map("table" -> table, "keyspace" -> keyspace))
.save()
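Any help?
Answer 0 (score: 0)
It looks like the connector does not currently support dynamically creating UDT types from DataFrame structs; the stack trace shows the failure comes from createCassandraTable, not from the write itself. You should file a ticket on the Spark Cassandra Connector Jira as a feature request. Until then, you can always manually create a new type that matches your struct type.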
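A minimal sketch of the manual route, assuming the connector is on the classpath and the keyspace already exists. The type name crawl_type, the object and method names, and the choice of crawl as the primary key are hypothetical; adjust them to your real schema. Whether the write then accepts the struct column depends on your connector version, so a flattening fallback that relies only on Spark's own API is included as well.

```scala
import com.datastax.spark.connector.cql.CassandraConnector
import org.apache.spark.SparkConf
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

object StructToCassandra {

  // Create the UDT and the table by hand instead of calling createCassandraTable.
  // Assumption: "crawl_type" is a made-up type name mirroring struct<id: string>,
  // and "crawl" is used as the primary key only because it is the sole column shown.
  def createSchemaByHand(conf: SparkConf, keyspace: String, table: String): Unit =
    CassandraConnector(conf).withSessionDo { session =>
      session.execute(
        s"CREATE TYPE IF NOT EXISTS $keyspace.crawl_type (id text)")
      session.execute(
        s"CREATE TABLE IF NOT EXISTS $keyspace.$table " +
          "(crawl frozen<crawl_type> PRIMARY KEY)")
      ()
    }

  // Same write call as in the question; the table must already exist.
  def write(df: DataFrame, keyspace: String, table: String): Unit =
    df.write
      .format("org.apache.spark.sql.cassandra")
      .options(Map("table" -> table, "keyspace" -> keyspace))
      .save()

  // Fallback: pull the nested field up into a plain string column so no UDT is
  // needed. "crawl_id" is a hypothetical column name.
  def flatten(df: DataFrame): DataFrame =
    df.select(col("crawl.id").as("crawl_id"))
}
```

With the question's variables this would be used as StructToCassandra.createSchemaByHand(sqlContext.sparkContext.getConf, keyspace, table) followed by StructToCassandra.write(df, keyspace, table). If the struct write is still rejected, the flattened DataFrame contains only a string column, which is the case that createCassandraTable already handles in the question.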