I am trying to create an RDD from a Dataset by mapping over each row in the Dataset. I get a cannot resolve method map(<lambda expression>) error on the map call in df1.map(r ->:
Dataset<Object> df1 = session.read().parquet(tableName).as(Encoders.bean(Object.class));
JavaRDD<List<Tuple3<Long, Integer, Double>>> tempDatas1 = df1.map(r -> new MapFunction<Object, List<Tuple3<Long, Integer, Double>>>(){
    //@Override
    public List<Tuple3<Long, Integer, Double>> call(Object row) throws Exception
    {
        // Get the sample property, remove leading and ending spaces and split it by comma
        // to get each sample individually
        List<Tuple2<String, Integer>> samples = zipWithIndex((row.getSamples().trim().split(",")));
        // Gets the unique identifier for that s.
        Long snp = row.getPos();
        // Calculates the hamming distance.
        return samples.stream().map(t -> {
            String alleles = t._1();
            Integer patient = t._2();
            List<String> values = Arrays.asList(alleles.split("\\|"));
            Double firstAllele = Double.parseDouble(values.get(0));
            Double secondAllele = Double.parseDouble(values.get(1));
            // Returns the initial s id, p id and the distance in form of Tuple.
            return new Tuple3<>(snp, patient, firstAllele + secondAllele);
        }).collect(Collectors.toList());
    }
});
Any help would be greatly appreciated.
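For reference, the per-row logic that the call method above is meant to perform can be sketched in plain Java, without Spark. This is only a sketch under assumptions: zipWithIndex is not shown in the post, so the version here is a hypothetical helper, Triple is a stand-in for scala.Tuple3<Long, Integer, Double>, and the class name AlleleSums and method name rowToTriples are made up for illustration.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class AlleleSums {

    // Stand-in for scala.Tuple3<Long, Integer, Double>, so the sketch needs no Spark or Scala jars.
    static final class Triple {
        final long snp;
        final int patient;
        final double sum;

        Triple(long snp, int patient, double sum) {
            this.snp = snp;
            this.patient = patient;
            this.sum = sum;
        }
    }

    // Hypothetical helper: pairs each element with its position in the array.
    static List<Map.Entry<String, Integer>> zipWithIndex(String[] items) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (int i = 0; i < items.length; i++) {
            out.add(new SimpleImmutableEntry<>(items[i], i));
        }
        return out;
    }

    // Splits a samples string such as "0|1,1|1" by comma, then sums the two
    // alleles of each sample and tags the result with the row position and sample index.
    static List<Triple> rowToTriples(long pos, String samples) {
        return zipWithIndex(samples.trim().split(",")).stream()
                .map(e -> {
                    String[] alleles = e.getKey().split("\\|");
                    double first = Double.parseDouble(alleles[0]);
                    double second = Double.parseDouble(alleles[1]);
                    return new Triple(pos, e.getValue(), first + second);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up input row: position 42 with three samples.
        for (Triple t : rowToTriples(42L, " 0|1,1|1,0|0 ")) {
            System.out.println(t.snp + " " + t.patient + " " + t.sum);
        }
    }
}
```

On the Spark side, Dataset.map in the Java API expects a MapFunction together with an Encoder for the result type as a second argument, while JavaRDD.map (reachable through df1.javaRDD()) takes an org.apache.spark.api.java.function.Function. Which of the two fits here is a guess, since the post only shows the failing call.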