I am mapping a Dataset of Row to a Dataset of a custom class.
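import org.apache.spark.api.java.function.MapFunction;

Dataset<Row> rows = sparkSession.read().orc(path);
Dataset<customClass> dataset = rows.map(
    // parseRow is a placeholder for the code that parses a Row into a customClass
    (MapFunction<Row, customClass>) row -> parseRow(row),
    Encoders.bean(customClass.class));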
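I am getting this AnalysisException:
AnalysisException: cannot resolve 'named_struct()' due to data type mismatch: input to function named_struct requires at least one argument;
I am using Spark 2.3.0 and encoding the custom class as a JavaBean.
I checked that the Encoders infer the schema correctly, and they do. So, technically, the map operation should work.
Has anyone run into this exception message? What does the named_struct function do? I could not find any relevant information on it in connection with Spark...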
root
|-- field1: struct (nullable = true)
| |-- value: string (nullable = true)
|-- field2: string (nullable = true)
|-- field3: integer (nullable = true)
|-- field4: double (nullable = true)
|-- field5: struct (nullable = true)
| |-- value: double (nullable = true)
|-- field6: struct (nullable = true)
| |-- field61: double (nullable = true)
| |-- field62: string (nullable = true)
| |-- field63: integer (nullable = true)
| |-- field64: struct (nullable = true)
| | |-- value: string (nullable = true)
|-- field7: struct (nullable = true)
| |-- value: double (nullable = true)
|-- field8: struct (nullable = true)
| |-- value: double (nullable = true)
|-- field9: struct (nullable = true)
| |-- field91: map (nullable = true)
| | |-- key: struct
| | |-- value: struct (valueContainsNull = true)
| | | |-- value: string (nullable = true)
| | | |-- field911: struct (nullable = true)
| | | | |-- value: double (nullable = true)
| | | |-- field912: struct (nullable = true)
| | | | |-- value: double (nullable = true)
| | | |-- field913: map (nullable = true)
| | | | |-- key: struct
| | | | |-- value: struct (valueContainsNull = true)
| | | | | |-- value: integer (nullable = false)
| | | | | |-- field9131: struct (nullable = true)
| | | | | | |-- value: double (nullable = true)
| | | | | |-- field9131: struct (nullable = true)
| | | | | | |-- value: double (nullable = true)
| | | |-- field914: struct (nullable = true)
| | | | |-- value: double (nullable = true)
| | | |-- field915: string (nullable = true)
|-- field10: string (nullable = true)
|-- field11: struct (nullable = true)
| |-- field111: map (nullable = true)
| | |-- key: struct
| | |-- value: struct (valueContainsNull = true)
| | | |-- value: integer (nullable = false)
| | | |-- field1111: struct (nullable = true)
| | | | |-- value: double (nullable = true)
| | | |-- field1112: struct (nullable = true)
| | | | |-- value: double (nullable = true)
|-- field12: boolean (nullable = true)
|-- field13: struct (nullable = true)
| |-- field131: integer (nullable = false)
| |-- field132: integer (nullable = false)
|-- field14: struct (nullable = true)
| |-- field141: string (nullable = true)
Answer 0 (score: 0)
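I finally found out why I was getting this named_struct error: one of the fields I was using was declared final, which means it had no setter. That violates the JavaBean contract.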
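In case it helps someone else track down the same problem, here is a minimal sketch of how such a field could be spotted without running Spark at all. It uses the standard java.beans.Introspector to list properties that lack a getter or a setter. The classes BrokenBean and FixedBean and the helper incompleteProperties are hypothetical names made up for this illustration, and I am only assuming that this check approximates what the bean encoder expects, not that it reproduces Spark's own logic.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class BeanCheck {

    // Hypothetical bean that breaks the contract: `value` is final, so it cannot have a setter.
    public static class BrokenBean {
        private final String value = "x";
        public String getValue() { return value; }
    }

    // Hypothetical bean that follows the contract: implicit no-arg constructor, getter and setter.
    public static class FixedBean {
        private String value;
        public String getValue() { return value; }
        public void setValue(String value) { this.value = value; }
    }

    // Returns the names of all properties that are missing a getter or a setter.
    public static List<String> incompleteProperties(Class<?> beanClass) {
        List<String> result = new ArrayList<>();
        try {
            // Stopping at Object.class leaves out the synthetic "class" property.
            PropertyDescriptor[] descriptors =
                Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors();
            for (PropertyDescriptor pd : descriptors) {
                if (pd.getReadMethod() == null || pd.getWriteMethod() == null) {
                    result.add(pd.getName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(incompleteProperties(BrokenBean.class)); // prints [value]
        System.out.println(incompleteProperties(FixedBean.class));  // prints []
    }
}
```

Running a check like this over the custom class and each nested bean class it references should point at the offending field; in my case the fix was simply to drop final and add the setter.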