I'm trying to map a Dataset<Row> to a Dataset<MyRow> in Apache Spark using Java. Does anyone know what the problem is?
Dataset<Row> flightsDF = spark.read().format("csv").option("header", "true").load( ... );
Dataset<Row> DSOfRows = flightsDF
        .filter( ... );
DSOfRows.show(5);
This code shows the first 5 rows as expected. Then I try to map each Row to a MyRow:
Dataset<MyRow> DSOfMyRows = DSOfRows.map(
        (MapFunction<Row, MyRow>) row -> new MyRow(row.getAs("Carrier"),
                row.getAs("ArrDelay")),
        Encoders.bean(MyRow.class));
DSOfMyRows.show(5);
This is where the problem appears: it prints only empty rows.
++
||
++
||
||
||
||
||
++
only showing top 5 rows
The MyRow class looks like this:
public static class MyRow implements Serializable {
    public String carrier;
    public Double arrDelay;

    MyRow(String carrier, Double delay) {
        this.carrier = carrier;
        this.arrDelay = delay;
    }

    String getCarrier() {
        return carrier;
    }

    public void setCarrier(String c) {
        this.carrier = c;
    }

    Double getArrDelay() {
        return arrDelay;
    }

    public void setArrDelay(Double delay) {
        this.arrDelay = delay;
    }
}
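One thing that may be worth checking is whether MyRow is visible as a JavaBean at all. The sketch below does not use Spark. It assumes that Encoders.bean derives its columns through standard JavaBeans introspection and keeps only properties that have both a public getter and a public setter. The names BeanPropertyCheck, PackagePrivateGetters, PublicGetters and readWriteProperties are made up for this illustration. The first class copies the accessor visibility of the MyRow above, and the second makes the getters public.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BeanPropertyCheck {

    // Hypothetical stand-in for MyRow: public setters, package-private getters.
    public static class PackagePrivateGetters {
        public String carrier;
        public Double arrDelay;

        String getCarrier() { return carrier; }
        public void setCarrier(String c) { this.carrier = c; }
        Double getArrDelay() { return arrDelay; }
        public void setArrDelay(Double delay) { this.arrDelay = delay; }
    }

    // Hypothetical variant: same fields, but getters are public and the
    // class keeps its implicit public no-argument constructor.
    public static class PublicGetters {
        private String carrier;
        private Double arrDelay;

        public String getCarrier() { return carrier; }
        public void setCarrier(String c) { this.carrier = c; }
        public Double getArrDelay() { return arrDelay; }
        public void setArrDelay(Double delay) { this.arrDelay = delay; }
    }

    // Returns the sorted names of properties that JavaBeans introspection
    // reports with both a read method and a write method.
    static List<String> readWriteProperties(Class<?> beanClass) {
        List<String> names = new ArrayList<>();
        try {
            // Object.class as stop class excludes the inherited "class" property.
            PropertyDescriptor[] descriptors =
                    Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors();
            for (PropertyDescriptor pd : descriptors) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    names.add(pd.getName());
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        System.out.println(readWriteProperties(PackagePrivateGetters.class));
        System.out.println(readWriteProperties(PublicGetters.class));
    }
}
```

Running main prints [] for the first class and [arrDelay, carrier] for the second. Introspection only sees public methods and ignores fields, so the package-private getters leave no readable properties. If Spark builds the bean schema the same way, an empty schema would be consistent with the empty rows shown above. Declaring getCarrier and getArrDelay public, and probably adding a public no-argument constructor so the bean can be instantiated, would be the first thing to try.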