I would like to know how RowReaderFactory and mapToPair compare in terms of efficiency in Spark.
// Option 1: read generic CassandraRow objects, then build each pair with mapToPair
JavaPairRDD<PartitionKey, Dog> dogPairRDD = functions.cassandraTable(keyspace, "dog")
        .mapToPair(new PairFunction<CassandraRow, PartitionKey, Dog>() {
            @Override
            public Tuple2<PartitionKey, Dog> call(CassandraRow row) throws Exception {
                PartitionKey partitionKey = new PartitionKey(row.getString("string1"), row.getString("string2"));
                Dog dog = new Dog("name", "color");
                return new Tuple2<PartitionKey, Dog>(partitionKey, dog);
            }
        });
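// Option 2: let the connector build key and value directly through RowReaderFactory instances
// (partitionKeyRRF is a RowReaderFactory<PartitionKey>; its definition is not shown here)
RowReaderFactory<Dog> dogRRF = new RowReaderDogFactory(); // implements RowReaderFactory<Dog>
JavaPairRDD<PartitionKey, Dog> dogPairRDD = functions.cassandraTable(keyspace, "dog", partitionKeyRRF, dogRRF);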
Thanks in advance :)