Kafka Spark Streaming data is not written to Cassandra; zero rows inserted

Date: 2014-12-09 18:31:34

Tags: cassandra apache-spark apache-kafka spark-streaming

When writing data from Spark to Cassandra, nothing is written. The setup is:
I am integrating Kafka, Spark Streaming, and Cassandra. I read Kafka messages and try to insert them into the Cassandra table created with CREATE TABLE TEST_TABLE(key INT PRIMARY KEY, value TEXT). Kafka to Spark Streaming works fine, but Spark to Cassandra does not: no data is written to the table.
I am able to open a connection to Cassandra, but no rows are inserted into the table. The log shows it connects and then disconnects a second later. All the strings from the System.out.println() calls appear in the output:

+++++++++++cassandra connector created++++++++++++++++++++++++++++
+++++++++++++streaming Connection done!+++++++++++++++++++++++++++
++++++++++++++++JavaDStream<TestTable> created++++++++++++++++++++++++++++

The Cassandra shell shows 0 rows. The full code, along with the logs and the dependencies, is below:
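(For context, a minimal sketch of how the testkeyspace.test_table table referenced by the code could be created using the DataStax Java driver from the dependency list further down; the class name CreateTestTable and the SimpleStrategy replication settings are assumptions for a single local node, not part of the original setup.)

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class CreateTestTable {
    public static void main(String[] args) {
        // Connect to the same local node the streaming job uses.
        Cluster cluster = Cluster.builder().addContactPoint("localhost").build();
        Session session = cluster.connect();

        // Keyspace name matches the writerBuilder("testkeyspace", ...) call below.
        session.execute("CREATE KEYSPACE IF NOT EXISTS testkeyspace "
                + "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}");

        // Table definition taken from the question: key INT PRIMARY KEY, value TEXT.
        session.execute("CREATE TABLE IF NOT EXISTS testkeyspace.test_table "
                + "(key INT PRIMARY KEY, value TEXT)");

        cluster.close();
    }
}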

public class SparkStream {
    static int key=0;
    public static void main(String args[]) throws Exception
    {

        if(args.length != 3)
        {
            System.out.println("parameters not given properly");
            System.exit(1);
        }

        Logger.getLogger("org").setLevel(Level.OFF);
        Logger.getLogger("akka").setLevel(Level.OFF);
        Map<String,Integer> topicMap = new HashMap<String,Integer>();
        String[] topic = args[2].split(",");
        for(String t: topic)
        {
            topicMap.put(t, new Integer(3));
        }

        /* Connection to Spark */
        SparkConf conf = new SparkConf();
        conf.set("spark.cassandra.connection.host", "localhost");
        JavaSparkContext sc = new JavaSparkContext("local[4]", "SparkStream",conf);
        JavaStreamingContext jssc = new JavaStreamingContext(sc, new Duration(5000));


        /* connection to cassandra */
        CassandraConnector connector = CassandraConnector.apply(sc.getConf());
        System.out.println("+++++++++++cassandra connector created++++++++++++++++++++++++++++");


        /* Receive Kafka streaming inputs */
        JavaPairReceiverInputDStream<String, String> messages = KafkaUtils.createStream(jssc, args[0], args[1], topicMap );
        System.out.println("+++++++++++++streaming Connection done!+++++++++++++++++++++++++++");


        /* Create DStream */                
        JavaDStream<TestTable> data = messages.map(new Function< Tuple2<String,String>, TestTable >() 
        {
            public TestTable call(Tuple2<String, String> message)
            {
                return new TestTable(new Integer(++key), message._2() );
            }
        }
        );
        System.out.println("++++++++++++++++JavaDStream<TestTable> created++++++++++++++++++++++++++++");


        /* Write to cassandra */
        javaFunctions(data).writerBuilder("testkeyspace", "test_table", mapToRow(TestTable.class)).saveToCassandra();


        jssc.start();
        jssc.awaitTermination();

    }
}

class TestTable implements Serializable
{
    Integer key;
    String value;

    public TestTable() {}

    public TestTable(Integer k, String v)
    {
        key=k;
        value=v;
    }

    public Integer getKey(){
        return key;
    }

    public void setKey(Integer k){
        key=k;
    }

    public String getValue(){
        return value;
    }

    public void setValue(String v){
        value=v;
    }

    public String toString(){
        return MessageFormat.format("TestTable'{'key={0}, value={1}'}'", key, value);

    }
}

The log is:

+++++++++++cassandra connector created++++++++++++++++++++++++++++
+++++++++++++streaming Connection done!+++++++++++++++++++++++++++
++++++++++++++++JavaDStream<TestTable> created++++++++++++++++++++++++++++
14/12/09 12:07:33 INFO core.Cluster: New Cassandra host localhost/127.0.0.1:9042 added
14/12/09 12:07:33 INFO cql.CassandraConnector: Connected to Cassandra cluster: Test Cluster
14/12/09 12:07:33 INFO cql.LocalNodeFirstLoadBalancingPolicy: Adding host 127.0.0.1 (datacenter1)
14/12/09 12:07:33 INFO cql.LocalNodeFirstLoadBalancingPolicy: Adding host 127.0.0.1 (datacenter1)
14/12/09 12:07:34 INFO cql.CassandraConnector: Disconnected from Cassandra cluster: Test Cluster

14/12/09 12:07:45 INFO core.Cluster: New Cassandra host localhost/127.0.0.1:9042 added
14/12/09 12:07:45 INFO cql.CassandraConnector: Connected to Cassandra cluster: Test Cluster
14/12/09 12:07:45 INFO cql.LocalNodeFirstLoadBalancingPolicy: Adding host 127.0.0.1 (datacenter1)
14/12/09 12:07:45 INFO cql.LocalNodeFirstLoadBalancingPolicy: Adding host 127.0.0.1 (datacenter1)
14/12/09 12:07:46 INFO cql.CassandraConnector: Disconnected from Cassandra cluster: Test Cluster

The pom.xml dependencies are:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka_2.10</artifactId>
        <version>1.1.0</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.10</artifactId>
        <version>1.1.0</version>
    </dependency>

    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.10</artifactId>
        <version>1.1.0</version>
    </dependency>

    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector-java_2.10</artifactId>
        <version>1.1.0</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.1.1</version>
    </dependency>

    <dependency>
        <groupId>com.msiops.footing</groupId>
        <artifactId>footing-tuple</artifactId>
        <version>0.2</version>
    </dependency>

    <dependency>
        <groupId>com.datastax.cassandra</groupId>
        <artifactId>cassandra-driver-core</artifactId>
        <version>2.1.3</version>
    </dependency>

Is there a problem with the code, or with the Cassandra configuration?

1 answer:

Answer 0 (score: 1)

Solved the problem. The column mapper could not access the getters and setters of the TestTable class, so I changed the access modifiers to public. But then I had two public classes in one file, which is a compile error. So I created a separate Java file, TestTable.java, containing the class:

public class TestTable implements Serializable { 
//code
}

Messages are now being read from Kafka and stored in the Cassandra table.
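For completeness, a sketch of what the standalone TestTable.java could look like: the same fields and accessors as in the question, with the class declared public in its own file.

import java.io.Serializable;
import java.text.MessageFormat;

// Public top-level class in its own file so the connector's column mapper can see it.
public class TestTable implements Serializable {

    private Integer key;
    private String value;

    // No-arg constructor kept from the original code.
    public TestTable() {}

    public TestTable(Integer k, String v) {
        key = k;
        value = v;
    }

    public Integer getKey() {
        return key;
    }

    public void setKey(Integer k) {
        key = k;
    }

    public String getValue() {
        return value;
    }

    public void setValue(String v) {
        value = v;
    }

    @Override
    public String toString() {
        return MessageFormat.format("TestTable'{'key={0}, value={1}'}'", key, value);
    }
}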