Is it possible to generate a histogram DataFrame from a Dataset<Row> table using Spark 2.1 with Java?
Answer 0 (score: 1)
Example: I have a Spark table named 'nation' with an integer column 'n_nationkey', and this is how I do it:
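// Needs Dataset and Row (org.apache.spark.sql), JavaRDD and JavaDoubleRDD (org.apache.spark.api.java), and scala.Tuple2.
String query = "select n_nationkey from nation" ;
Dataset<Row> df = spark.sql(query);
// Pull the integer column out of each Row.
JavaRDD<Integer> jdf = df.toJavaRDD().map(row -> row.getInt(0));
// histogram() is only defined on JavaDoubleRDD, so convert first.
JavaDoubleRDD example = jdf.mapToDouble(y -> y);
// 5 evenly spaced buckets between min and max: _1() holds the bucket boundaries, _2() the counts.
Tuple2<double[], long[]> resultsnew = example.histogram(5);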
If the column is of type double, just replace the code above with the following:
JavaRDD<Double> jdf = df.toJavaRDD().map(row -> row.getDouble(0));
JavaDoubleRDD example = jdf.mapToDouble(y -> y);
Tuple2<double[], long[]> resultsnew = example.histogram(5);
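Note that histogram(5) returns two arrays rather than a DataFrame: six bucket boundaries and five counts. As a minimal sketch of how those arrays line up, the plain-Java helper below pairs each count with its lower and upper boundary. The names HistogramBuckets, Bucket and toBuckets are hypothetical and not part of Spark, and the sample arrays are made-up values standing in for a real result. Every bucket excludes its upper boundary except the last one, which includes it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: pairs histogram bucket edges with counts.
class HistogramBuckets {

    // One bucket with its lower edge, upper edge, and the number of values in it.
    static final class Bucket {
        final double lower;
        final double upper;
        final long count;

        Bucket(double lower, double upper, long count) {
            this.lower = lower;
            this.upper = upper;
            this.count = count;
        }

        @Override
        public String toString() {
            return lower + " to " + upper + ": " + count;
        }
    }

    // edges is expected to hold one more element than counts,
    // which is the shape histogram(bucketCount) returns.
    static List<Bucket> toBuckets(double[] edges, long[] counts) {
        if (edges.length != counts.length + 1) {
            throw new IllegalArgumentException("expected counts.length + 1 edges");
        }
        List<Bucket> buckets = new ArrayList<>();
        for (int i = 0; i < counts.length; i++) {
            buckets.add(new Bucket(edges[i], edges[i + 1], counts[i]));
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for resultsnew._1() and resultsnew._2().
        double[] edges = {0.0, 5.0, 10.0};
        long[] counts = {3L, 2L};
        for (Bucket b : toBuckets(edges, counts)) {
            System.out.println(b);
        }
    }
}
```

With the real result, the call would be HistogramBuckets.toBuckets(resultsnew._1(), resultsnew._2()). The resulting list is one possible starting point for building rows if a DataFrame is needed.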