Scala-Spark: flattening a nested schema from an XML file that contains multiple array and struct types

Time: 2018-11-16 10:57:29

Tags: scala apache-spark

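I have a DataFrame with the schema shown below, and I cannot flatten it into the proper format. I need some help simplifying this complex schema. I tried using some functions, but I get the following error: org.apache.spark.sql.AnalysisException: Only one generator allowed per select clause but found 2: generatorouter(explode(RetailTransaction.LineItem AS LineItem )), generatorouter(explode(RetailTransaction.Total AS Total));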

I was able to flatten the struct types, but I ran into the problem above when handling the array types.

 `root
 |-- BusinessDayDate: string (nullable = true)
 |-- ControlTransaction: struct (nullable = true)
 |    |-- OperatorSignOff: struct (nullable = true)
 |    |    |-- CloseBusinessDayDate: string (nullable = true)
 |    |    |-- CloseTransactionSequenceNumber: long (nullable = true)
 |    |    |-- EndDateTimestamp: string (nullable = true)
 |    |    |-- OpenBusinessDayDate: string (nullable = true)
 |    |    |-- OpenTransactionSequenceNumber: long (nullable = true)
 |    |    |-- StartDateTimestamp: string (nullable = true)
 |    |-- ReasonCode: string (nullable = true)
 |    |-- _Version: double (nullable = true)
 |-- CurrencyCode: string (nullable = true)
 |-- EndDateTime: string (nullable = true)
 |-- OperatorID: struct (nullable = true)
 |    |-- _OperatorName: string (nullable = true)
 |    |-- _VALUE: long (nullable = true)
 |-- RetailStoreID: long (nullable = true)
 |-- RetailTransaction: struct (nullable = true)
 |    |-- ItemCount: long (nullable = true)
 |    |-- LineItem: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- Sale: struct (nullable = true)
 |    |    |    |    |-- Description: string (nullable = true)
 |    |    |    |    |-- DiscountAmount: double (nullable = true)
 |    |    |    |    |-- ExtendedAmount: double (nullable = true)
 |    |    |    |    |-- ExtendedDiscountAmount: double (nullable = true)
 |    |    |    |    |-- ItemID: long (nullable = true)
 |    |    |    |    |-- Itemizers: struct (nullable = true)
 |    |    |    |    |    |-- _FoodStampable: boolean (nullable = true)
 |    |    |    |    |    |-- _Itemizer6: boolean (nullable = true)
 |    |    |    |    |    |-- _Itemizer8: boolean (nullable = true)
 |    |    |    |    |    |-- _Tax1: boolean (nullable = true)
 |    |    |    |    |    |-- _VALUE: long (nullable = true)
 |    |    |    |    |-- MerchandiseHierarchy: struct (nullable = true)
 |    |    |    |    |    |-- _DepartmentDescription: string (nullable = true)
 |    |    |    |    |    |-- _Level: string (nullable = true)
 |    |    |    |    |    |-- _VALUE: long (nullable = true)
 |    |    |    |    |-- OperatorSequence: long (nullable = true)
 |    |    |    |    |-- POSIdentity: struct (nullable = true)
 |    |    |    |    |    |-- POSItemID: long (nullable = true)
 |    |    |    |    |    |-- Qualifier: long (nullable = true)
 |    |    |    |    |    |-- _POSIDType: string (nullable = true)
 |    |    |    |    |-- Quantity: double (nullable = true)
 |    |    |    |    |-- RegularSalesUnitPrice: double (nullable = true)
 |    |    |    |    |-- ReportCode: long (nullable = true)
 |    |    |    |    |-- _ItemType: string (nullable = true)
 |    |    |    |-- SequenceNumber: long (nullable = true)
 |    |    |    |-- Tax: struct (nullable = true)
 |    |    |    |    |-- Amount: double (nullable = true)
 |    |    |    |    |-- Percent: double (nullable = true)
 |    |    |    |    |-- Reason: string (nullable = true)
 |    |    |    |    |-- TaxableAmount: double (nullable = true)
 |    |    |    |    |-- _TaxDescription: string (nullable = true)
 |    |    |    |    |-- _TaxID: long (nullable = true)
 |    |    |    |-- Tender: struct (nullable = true)
 |    |    |    |    |-- Amount: double (nullable = true)
 |    |    |    |    |-- Authorization: struct (nullable = true)
 |    |    |    |    |    |-- AuthorizationCode: string (nullable = true)
 |    |    |    |    |    |-- AuthorizationDateTime: string (nullable = true)
 |    |    |    |    |    |-- ReferenceNumber: long (nullable = true)
 |    |    |    |    |    |-- RequestedAmount: double (nullable = true)
 |    |    |    |    |    |-- _ElectronicSignature: boolean (nullable = true)
 |    |    |    |    |    |-- _HostAuthorized: boolean (nullable = true)
 |    |    |    |    |-- OperatorSequence: long (nullable = true)
 |    |    |    |    |-- TenderID: long (nullable = true)
 |    |    |    |    |-- _TenderDescription: string (nullable = true)
 |    |    |    |    |-- _TenderType: string (nullable = true)
 |    |    |    |    |-- _TypeCode: string (nullable = true)
 |    |    |    |-- _EntryMethod: string (nullable = true)
 |    |    |    |-- _weightItem: boolean (nullable = true)
 |    |-- PerformanceMetrics: struct (nullable = true)
 |    |    |-- IdleTime: long (nullable = true)
 |    |    |-- RingTime: long (nullable = true)
 |    |    |-- TenderTime: long (nullable = true)
 |    |-- ReceiptDateTime: string (nullable = true)
 |    |-- Total: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- _TotalType: string (nullable = true)
 |    |    |    |-- _VALUE: double (nullable = true)
 |    |-- TransactionCount: long (nullable = true)
 |    |-- _Version: double (nullable = true)
 |-- SequenceNumber: long (nullable = true)
 |-- WorkstationID: long (nullable = true)`
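
The error quoted above appears when two generators are placed in the same select. A minimal sketch of one common way around it is to explode each array in its own step. The object name `FlattenSketch`, the helper names `sample` and `flatten`, the sample JSON record, and the output column aliases are all hypothetical; the sample covers only a small subset of the schema above, and Spark 2.2 or later is assumed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode_outer}

object FlattenSketch {
  // Hypothetical sample record that mirrors a small part of the schema above.
  private val json =
    """{"BusinessDayDate":"2018-11-16","RetailTransaction":{"ItemCount":2,
      |"LineItem":[{"SequenceNumber":1,"Sale":{"Description":"Milk","ItemID":101}},
      |{"SequenceNumber":2,"Sale":{"Description":"Bread","ItemID":102}}],
      |"Total":[{"_TotalType":"TransactionGrossAmount","_VALUE":5.0},
      |{"_TotalType":"TransactionNetAmount","_VALUE":4.5}]}}""".stripMargin.replace("\n", "")

  // Builds a one-row DataFrame with two array-of-struct columns nested in a struct.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    spark.read.json(Seq(json).toDS())
  }

  // Each withColumn call is its own projection, so each one holds a single generator.
  def flatten(df: DataFrame): DataFrame =
    df
      .withColumn("LineItem", explode_outer(col("RetailTransaction.LineItem")))
      .withColumn("Total", explode_outer(col("RetailTransaction.Total")))
      .select(
        col("BusinessDayDate"),
        col("RetailTransaction.ItemCount").as("ItemCount"),
        col("LineItem.SequenceNumber").as("LineItem_SequenceNumber"),
        col("LineItem.Sale.Description").as("Sale_Description"),
        col("LineItem.Sale.ItemID").as("Sale_ItemID"),
        col("Total._TotalType").as("TotalType"),
        col("Total._VALUE").as("TotalValue")
      )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-sketch")
      .getOrCreate()
    val flat = flatten(sample(spark))
    flat.printSchema()
    flat.show(false)
    spark.stop()
  }
}
```

Exploding both arrays on the same DataFrame produces one row per (LineItem, Total) pair, so the sample record with two line items and two totals becomes four rows. If that cross product is not wanted, a possible alternative is to explode `LineItem` and `Total` into two separate DataFrames and keep them apart or join them on a transaction key.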

Thanks in advance, Seetharam

0 answers:

No answers