This is my schema:
root
|-- DataPartition: string (nullable = true)
|-- TimeStamp: string (nullable = true)
|-- PeriodId: long (nullable = true)
|-- FinancialAsReportedLineItemName: struct (nullable = true)
| |-- _VALUE: string (nullable = true)
| |-- _languageId: long (nullable = true)
|-- FinancialLineItemSource: long (nullable = true)
|-- FinancialStatementLineItemSequence: long (nullable = true)
|-- FinancialStatementLineItemValue: double (nullable = true)
|-- FiscalYear: long (nullable = true)
|-- IsAnnual: boolean (nullable = true)
|-- IsAsReportedCurrencySetManually: boolean (nullable = true)
|-- IsCombinedItem: boolean (nullable = true)
|-- IsDerived: boolean (nullable = true)
|-- IsExcludedFromStandardization: boolean (nullable = true)
|-- IsFinal: boolean (nullable = true)
|-- IsTotal: boolean (nullable = true)
|-- ParentLineItemId: long (nullable = true)
|-- PeriodPermId: struct (nullable = true)
| |-- _VALUE: long (nullable = true)
| |-- _objectTypeId: long (nullable = true)
|-- ReportedCurrencyId: long (nullable = true)
Starting from the schema above, I am trying to do this:
val temp = tempNew1
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
.withColumn("PeriodPermId", $"PeriodPermId._VALUE")
.withColumn("PeriodPermId_objectTypeId", $"PeriodPermId._objectTypeId").drop($"AsReportedItem").drop($"AsReportedItem")
I don't know what I am missing here. I get the following error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Can't extract value from FinancialAsReportedLineItemName#2262: need struct type but got string;
Answer 0 (score: 2)
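The problem is that the first withColumn call replaces the struct column FinancialAsReportedLineItemName with the string FinancialAsReportedLineItemName._VALUE. By the time the second call evaluates FinancialAsReportedLineItemName._languageId, the column is no longer a struct, so there is nothing to extract from.
You should change the following two lines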
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
to
.withColumn("FinancialAsReportedLineItemName_value", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
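If the FinancialAsReportedLineItemName_value column has to be named FinancialAsReportedLineItemName, then swap the order of the two withColumn calls, so that _languageId is extracted while the column is still a struct: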
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
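Another way to avoid the ordering problem is to pull every struct field out in a single select, so that all expressions are resolved against the original struct column. The following is only a minimal sketch under assumptions: the object name FlattenStructs, the helpers flatten and sample, the one-row sample data, and the local SparkSession are hypothetical and not part of the question; only the column names come from the schema above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct}

object FlattenStructs {
  // All three expressions are resolved against the input DataFrame, where
  // FinancialAsReportedLineItemName is still a struct, so their order is irrelevant.
  def flatten(df: DataFrame): DataFrame = df.select(
    col("PeriodId"),
    col("FinancialAsReportedLineItemName._VALUE").as("FinancialAsReportedLineItemName"),
    col("FinancialAsReportedLineItemName._languageId").as("FinancialAsReportedLineItemName_languageId")
  )

  // Hypothetical one-row sample that mimics part of the question's schema.
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq((1L, "Revenue", 505074L))
      .toDF("PeriodId", "v", "l")
      .select(
        col("PeriodId"),
        struct(col("v").as("_VALUE"), col("l").as("_languageId"))
          .as("FinancialAsReportedLineItemName")
      )
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for demonstration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("flatten-structs")
      .getOrCreate()
    flatten(sample(spark)).printSchema()
    spark.stop()
  }
}
```

The same pattern extends to PeriodPermId: add its _VALUE and _objectTypeId fields to the select with the aliases you want. Note that select keeps only the columns you list, so any other columns you need must be listed as well.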