Spark DataFrame throws an error when querying a nested column

Time: 2017-03-26 09:44:54

Tags: scala apache-spark apache-spark-sql spark-dataframe

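I am working with a DataFrame that has the schema shown below. It is basically an XML file that I converted to a DataFrame for further processing. I am trying to extract the _Date column, but it looks like some kind of type mismatch is occurring.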

I need to extract the _Date column, but the attempt throws an error. This is the schema of the DataFrame:

df1.printSchema

 |-- PlayWeek: struct (nullable = true)
 |    |-- TicketSales: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- PlayDate: array (nullable = true)
 |    |    |    |    |-- element: struct (containsNull = true)
 |    |    |    |    |    |-- BoxOfficeDetail: array (nullable = true)
 |    |    |    |    |    |    |-- element: struct (containsNull = true)
 |    |    |    |    |    |    |    |-- VisualFormatCd: struct (nullable = true)
 |    |    |    |    |    |    |    |    |-- Code: struct (nullable = true)
 |    |    |    |    |    |    |    |    |    |-- _SequenceId: long (nullable = true)
 |    |    |    |    |    |    |    |    |    |-- _VALUE: double (nullable = true)
 |    |    |    |    |    |    |    |-- _SessionTypeCd: string (nullable = true)
 |    |    |    |    |    |    |    |-- _TicketPrice: double (nullable = true)
 |    |    |    |    |    |    |    |-- _TicketQuantity: long (nullable = true)
 |    |    |    |    |    |    |    |-- _TicketTax: double (nullable = true)
 |    |    |    |    |    |    |    |-- _TicketTypeCd: string (nullable = true)
 |    |    |    |    |    |-- _Date: string (nullable = true)
 |    |    |    |-- _FilmId: long (nullable = true)
 |    |    |    |-- _Screen: long (nullable = true)
 |    |    |    |-- _TheatreId: long (nullable = true)
 |    |-- _BusinessEndDate: string (nullable = true)
 |    |-- _BusinessStartDate: string (nullable = true)

Any help would be greatly appreciated.
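
For reference, here is a minimal sketch of one way `_Date` might be reached. It assumes the mismatch comes from `_Date` sitting inside two levels of arrays (`TicketSales`, then `PlayDate`), so each array is exploded into rows before the field is read. The helper name `extractDates` and the aliases `sale`, `playDate`, and `Date` are made up for illustration and are not part of the original question.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical helper: flattens the two array levels above _Date.
// Assumes the input has the schema printed by df1.printSchema.
def extractDates(df: DataFrame): DataFrame =
  df
    // One row per element of the PlayWeek.TicketSales array
    .select(explode(col("PlayWeek.TicketSales")).as("sale"))
    // One row per element of each sale's PlayDate array
    .select(explode(col("sale.PlayDate")).as("playDate"))
    // playDate is now a plain struct, so _Date is a simple field access
    .select(col("playDate._Date").as("Date"))
```

Calling `extractDates(df1).show()` should then list one date per row. Note that `explode` drops rows whose array is null or empty, so records without any `PlayDate` entries would not appear in the result.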

0 answers:

No answers yet