RowReaderFactory vs mapToPair (efficiency)

Time: 2016-03-14 10:57:24

Tags: java apache-spark cassandra datastax-enterprise

I would like to know how RowReaderFactory and mapToPair compare in terms of efficiency when reading a Cassandra table into a pair RDD in Spark.

First approach: mapToPair

JavaPairRDD<PartitionKey, Dog> dogPairRDD = functions.cassandraTable(keyspace, "dog")
    .mapToPair(new PairFunction<CassandraRow, PartitionKey, Dog>() {
        public Tuple2<PartitionKey, Dog> call(CassandraRow row) throws Exception {
            PartitionKey partitionKey = new PartitionKey(row.getString("string1"), row.getString("string2"));
            Dog dog = new Dog("name","color");
            return new Tuple2<PartitionKey, Dog>(partitionKey, dog);
        }
    });

Second approach: RowReaderFactory

RowReaderFactory<Dog> dogRRF = new RowReaderDogFactory(); // implements RowReaderFactory<Dog>
// partitionKeyRRF is a RowReaderFactory<PartitionKey>; its definition is not shown here

JavaPairRDD<PartitionKey, Dog> dogPairRDD = functions.cassandraTable(keyspace, "dog", partitionKeyRRF, dogRRF);
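
To make the comparison concrete, here is a minimal sketch of the structural difference between the two approaches, written without Spark or Cassandra. It assumes (this is my understanding of the connector, not something confirmed in this thread) that the mapToPair route first builds a generic row object for every row and then converts it, while a custom row reader converts the raw row to the target types directly. RawRow, GenericRow, the column layout, and the stripped-down PartitionKey and Dog classes are all hypothetical stand-ins, not connector classes.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a raw driver row: column values by position.
final class RawRow {
    final Object[] values;
    RawRow(Object... values) { this.values = values; }
}

// Hypothetical stand-in for a generic row (the role CassandraRow plays): values by name.
final class GenericRow {
    private final Map<String, Object> columns = new HashMap<>();
    GenericRow(String[] names, RawRow raw) {
        for (int i = 0; i < names.length; i++) {
            columns.put(names[i], raw.values[i]);
        }
    }
    String getString(String name) { return (String) columns.get(name); }
}

// Simplified versions of the classes from the question.
final class PartitionKey {
    final String string1;
    final String string2;
    PartitionKey(String string1, String string2) { this.string1 = string1; this.string2 = string2; }
}

final class Dog {
    final String name;
    final String color;
    Dog(String name, String color) { this.name = name; this.color = color; }
}

final class RowConversionSketch {
    // Assumed column order of the hypothetical "dog" table.
    static final String[] COLUMNS = {"string1", "string2", "name", "color"};

    // Approach 1 (mapToPair style): build a generic row first, then map it.
    static List<Map.Entry<PartitionKey, Dog>> viaGenericRow(List<RawRow> rows) {
        List<Map.Entry<PartitionKey, Dog>> out = new ArrayList<>();
        for (RawRow raw : rows) {
            GenericRow row = new GenericRow(COLUMNS, raw); // one extra object per row
            out.add(new SimpleImmutableEntry<>(
                new PartitionKey(row.getString("string1"), row.getString("string2")),
                new Dog(row.getString("name"), row.getString("color"))));
        }
        return out;
    }

    // Approach 2 (row reader style): convert the raw row directly, no intermediate row.
    static List<Map.Entry<PartitionKey, Dog>> viaDirectReader(List<RawRow> rows) {
        List<Map.Entry<PartitionKey, Dog>> out = new ArrayList<>();
        for (RawRow raw : rows) {
            out.add(new SimpleImmutableEntry<>(
                new PartitionKey((String) raw.values[0], (String) raw.values[1]),
                new Dog((String) raw.values[2], (String) raw.values[3])));
        }
        return out;
    }

    public static void main(String[] args) {
        List<RawRow> rows = Arrays.asList(new RawRow("a", "b", "Rex", "brown"));
        // Both routes yield the same pairs; they differ only in per-row work.
        System.out.println(viaGenericRow(rows).get(0).getValue().name);
        System.out.println(viaDirectReader(rows).get(0).getValue().name);
    }
}
```

If that assumption holds, both routes produce the same pairs and the second one only saves the per-row intermediate object and name lookups. Whether that saving is measurable next to network and deserialization cost is exactly what I am unsure about.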

Thanks in advance :)

0 answers:

No answers yet