Spark: named_struct requires at least one argument

Asked: 2018-07-18 23:11:52

Tags: java apache-spark apache-spark-sql

I am mapping a Dataset of Rows to a Dataset of a custom class.

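Dataset<Row> rows = sparkSession.read().orc(path);
Dataset<customClass> dataset = rows.map(
    // toCustomClass is a placeholder for the logic that parses a Row into a customClass
    (MapFunction<Row, customClass>) row -> toCustomClass(row),
    Encoders.bean(customClass.class));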

I am getting this AnalysisException:

AnalysisException: cannot resolve 'named_struct()' due to data type mismatch: input to function named_struct requires at least one argument;

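I am using Spark 2.3.0 and encoding the custom class as a JavaBean.

I checked whether the Encoder infers the schema correctly, and it does (the inferred schema is printed below). So, technically, the map operation should work.

Has anyone run into this exception message? What does the named_struct function do? I could not find anything relevant about it in connection with Spark...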

root
 |-- field1: struct (nullable = true)
 |    |-- value: string (nullable = true)
 |-- field2: string (nullable = true)
 |-- field3: integer (nullable = true)
 |-- field4: double (nullable = true)
 |-- field5: struct (nullable = true)
 |    |-- value: double (nullable = true)
 |-- field6: struct (nullable = true)
 |    |-- field61: double (nullable = true)
 |    |-- field62: string (nullable = true)
 |    |-- field63: integer (nullable = true)
 |    |-- field64: struct (nullable = true)
 |    |    |-- value: string (nullable = true)
 |-- field7: struct (nullable = true)
 |    |-- value: double (nullable = true)
 |-- field8: struct (nullable = true)
 |    |-- value: double (nullable = true)
 |-- field9: struct (nullable = true)
 |    |-- field91: map (nullable = true)
 |    |    |-- key: struct
 |    |    |-- value: struct (valueContainsNull = true)
 |    |    |    |-- value: string (nullable = true)
 |    |    |    |-- field911: struct (nullable = true)
 |    |    |    |    |-- value: double (nullable = true)
 |    |    |    |-- field912: struct (nullable = true)
 |    |    |    |    |-- value: double (nullable = true)
 |    |    |    |-- field913: map (nullable = true)
 |    |    |    |    |-- key: struct
 |    |    |    |    |-- value: struct (valueContainsNull = true)
 |    |    |    |    |    |-- value: integer (nullable = false)
 |    |    |    |    |    |-- field9131: struct (nullable = true)
 |    |    |    |    |    |    |-- value: double (nullable = true)
 |    |    |    |    |    |-- field9131: struct (nullable = true)
 |    |    |    |    |    |    |-- value: double (nullable = true)
 |    |    |    |-- field914: struct (nullable = true)
 |    |    |    |    |-- value: double (nullable = true)
 |    |    |    |-- field915: string (nullable = true)
 |-- field10: string (nullable = true)
 |-- field11: struct (nullable = true)
 |    |-- field111: map (nullable = true)
 |    |    |-- key: struct
 |    |    |-- value: struct (valueContainsNull = true)
 |    |    |    |-- value: integer (nullable = false)
 |    |    |    |-- field1111: struct (nullable = true)
 |    |    |    |    |-- value: double (nullable = true)
 |    |    |    |-- field1112: struct (nullable = true)
 |    |    |    |    |-- value: double (nullable = true)
 |-- field12: boolean (nullable = true)
 |-- field13: struct (nullable = true)
 |    |-- field131: integer (nullable = false)
 |    |-- field132: integer (nullable = false)
 |-- field14: struct (nullable = true)
 |    |-- field141: string (nullable = true)

1 answer:

Answer 0 (score: 0)

I finally found out why this named_struct error occurs: one of the fields I was using was declared final, which means it had no setter. That violates the JavaBean contract.
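
To check a class for this kind of problem before handing it to Encoders.bean, something like the following might help. It is only a sketch using the standard java.beans.Introspector and does not call Spark at all. BrokenBean, ValidBean and incompleteProperties are hypothetical names made up for illustration, and it assumes the bean encoder expects every property to have both a getter and a setter, as the JavaBean contract requires.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class BeanContractCheck {

    // Hypothetical bean that breaks the contract: a final field cannot have a setter.
    public static class BrokenBean {
        private final String value = "fixed";
        public String getValue() { return value; }
    }

    // Hypothetical bean that follows the contract: no-arg constructor, getter and setter.
    public static class ValidBean {
        private String value;
        public ValidBean() { }
        public String getValue() { return value; }
        public void setValue(String value) { this.value = value; }
    }

    // Returns the names of the properties that are missing a getter or a setter.
    public static List<String> incompleteProperties(Class<?> beanClass) {
        try {
            // Stop at Object so the built-in "class" property is not reported.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> incomplete = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() == null || pd.getWriteMethod() == null) {
                    incomplete.add(pd.getName());
                }
            }
            return incomplete;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("BrokenBean: " + incompleteProperties(BrokenBean.class));
        System.out.println("ValidBean: " + incompleteProperties(ValidBean.class));
    }
}
```

Any property this reports for a class (including the nested classes used as struct fields) is a candidate for the fix described above: drop the final modifier and add a public setter, and make sure the class has a public no-argument constructor.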