Answer 0 (score: 0)
You can use the DataFrameNaFunctions object (available through Dataset.na()) to filter out or replace NaN values in a Dataset:
Example:
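import org.apache.spark.sql.DataFrameNaFunctions;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

Dataset<Row> yourDataSet = sparkSession.createDataFrame(yourJavaRDDCollection, yourSchema);
// na() returns the DataFrameNaFunctions helper bound to this Dataset
DataFrameNaFunctions dfNaNFilter = yourDataSet.na();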
// To remove every row that contains a null or NaN value:
Dataset<Row> nonNaNValues = dfNaNFilter.drop();
// To replace null or NaN values in numeric columns with a number (e.g. 104):
Dataset<Row> replacedNaNValues = dfNaNFilter.fill(104);
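
To make the difference between the two calls concrete, here is a minimal plain-Java sketch that models their row-level behaviour without Spark. It is only an illustration under assumptions: rows are represented as Double[] arrays, and the names NaNHandlingSketch, isMissing, dropRows and fillRows are hypothetical and not part of the Spark API. The drop model discards a whole row if any cell is null or NaN, while the fill model keeps every row and only replaces the missing cells.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NaNHandlingSketch {

    // True when the cell is missing (null) or is a floating-point NaN.
    // NaN == NaN is false in Java, so Double.isNaN has to be used instead of ==.
    public static boolean isMissing(Double cell) {
        return cell == null || cell.isNaN();
    }

    // Models drop(): keep only the rows in which no cell is null or NaN.
    public static List<Double[]> dropRows(List<Double[]> rows) {
        List<Double[]> kept = new ArrayList<>();
        for (Double[] row : rows) {
            if (Arrays.stream(row).noneMatch(NaNHandlingSketch::isMissing)) {
                kept.add(row);
            }
        }
        return kept;
    }

    // Models fill(value): keep every row, replacing null/NaN cells with value.
    public static List<Double[]> fillRows(List<Double[]> rows, double value) {
        List<Double[]> filled = new ArrayList<>();
        for (Double[] row : rows) {
            Double[] copy = row.clone(); // leave the input rows untouched
            for (int i = 0; i < copy.length; i++) {
                if (isMissing(copy[i])) {
                    copy[i] = value;
                }
            }
            filled.add(copy);
        }
        return filled;
    }

    public static void main(String[] args) {
        List<Double[]> rows = Arrays.asList(
            new Double[]{1.0, 2.0},
            new Double[]{Double.NaN, 3.0},
            new Double[]{4.0, null});

        System.out.println(dropRows(rows).size());                         // prints 1
        System.out.println(Arrays.toString(fillRows(rows, 104.0).get(1))); // prints [104.0, 3.0]
    }
}
```

In this model, dropping leaves a single row out of three, whereas filling keeps all three. That trade-off (losing rows versus introducing a placeholder value) is usually what decides between drop() and fill() in practice.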