Spark and Cassandra: saveToCassandra

Time: 2014-12-13 12:08:54

Tags: java apache-spark cassandra-2.0

I am using Cassandra and Spark. I have a JavaRDD<String> containing 6 comma-separated columns, as shown below:

header: canal,action,time,timeend,client

I created a table, Mytable, with 6 columns:

CREATE TABLE IF NOT EXISTS ref_event_by_user_session (
    canal       TEXT,
    action      TEXT,
    time        timestamp,
    timeend     timestamp,
    Client      INT,
    PRIMARY KEY(canal, action, time, timeend)
);

Now I want to save my JavaRDD to my Cassandra table using javaFunctions().saveToCassandra, but I don't know how to use it. Can you tell me how to do this?

1 answer:

Answer 0 (score: 2)

I found the solution, in case it helps someone :) I just create a Java class, Table2, with the fields canal String, action String, time Date, timeend Date, Client String.

Then:

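// Imports assumed for this snippet (Spark Java API and the DataStax spark-cassandra-connector Java API).
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import scala.Tuple2;

// session_7 holds (KeyTable2, ValueTable2) pairs; map each pair to a Table2 bean.
JavaRDD<Table2> table2 = session_7.map(new Function<Tuple2<KeyTable2, ValueTable2>, Table2>() {
    @Override
    public Table2 call(Tuple2<KeyTable2, ValueTable2> pair) throws Exception {
        // Replace "+" with "." so the raw value matches the ".SSS" part of the pattern.
        SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
        Date startDate = formatter.parse(pair._1.time.replace("+", "."));
        Date endDate = formatter.parse(pair._1.endtime.replace("+", "."));
        return new Table2(pair._2.canal, pair._2.motif, startDate, endDate, pair._1.client);
    }
});

// "Schema" is the keyspace name and "Table" is the table name.
javaFunctions(table2).writerBuilder("Schema", "Table", mapToRow(Table2.class)).saveToCassandra();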
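The timestamp handling inside the mapper can be tried on its own, without Spark or Cassandra. The following is only a sketch: the class name TimestampParsing, its methods, and the sample input "2014-12-13 12:08:54+123" are hypothetical, and the UTC time zone is pinned here purely so the result does not depend on the machine's default zone (the original code uses the default zone).

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// Hypothetical helper, not part of the original answer.
class TimestampParsing {
    static final String PATTERN = "yyyy-MM-dd HH:mm:ss.SSS";

    // SimpleDateFormat is not thread-safe, so a fresh instance is built per call.
    private static SimpleDateFormat newFormatter() {
        SimpleDateFormat formatter = new SimpleDateFormat(PATTERN);
        // Assumption: timestamps are interpreted as UTC for a reproducible result.
        formatter.setTimeZone(TimeZone.getTimeZone("UTC"));
        return formatter;
    }

    // Turns "…:54+123" into "…:54.123" and parses it with the pattern above.
    static Date parse(String raw) {
        try {
            return newFormatter().parse(raw.replace("+", "."));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable timestamp: " + raw, e);
        }
    }

    // Formats a Date back with the same pattern, useful for checking the round trip.
    static String format(Date date) {
        return newFormatter().format(date);
    }
}
```

Calling TimestampParsing.parse on a value that does not match the pattern throws an IllegalArgumentException instead of a checked ParseException, which can make it easier to call from a lambda or an anonymous Function.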

Note: even if the fields are public, you must add getters to the class.
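A bean along these lines might look as follows. This is a sketch under assumptions, not the answer author's actual class: the field names are assumed to mirror the column names, Client is kept as a String as described in the answer (the table declares it as INT, so an Integer may be the better fit), and the class is made Serializable on the assumption that Spark needs to ship instances between nodes. In a real project it would normally be a public class in its own Table2.java file.

```java
import java.io.Serializable;
import java.util.Date;

// Sketch of the Table2 bean described in the answer; names and types are assumptions.
class Table2 implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String canal;
    private final String action;
    private final Date time;
    private final Date timeend;
    private final String client; // the CQL table declares INT; String follows the answer

    Table2(String canal, String action, Date time, Date timeend, String client) {
        this.canal = canal;
        this.action = action;
        this.time = time;
        this.timeend = timeend;
        this.client = client;
    }

    // One getter per field, as the note above requires.
    public String getCanal() { return canal; }
    public String getAction() { return action; }
    public Date getTime() { return time; }
    public Date getTimeend() { return timeend; }
    public String getClient() { return client; }
}
```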