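I have a single-node Cassandra cluster that is NOT co-located with my Hadoop cluster. I am using the Hadoop cluster to bulk load data into Cassandra through CqlOutputFormat, with the following job configuration: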
Job job = new Job(conf);
Configuration jobConf = job.getConfiguration();
ConfigHelper.setOutputInitialAddress(jobConf, "10.27.124.73");
ConfigHelper.setInputInitialAddress(jobConf, "10.27.124.73");
//ConfigHelper.setOutputRpcPort(jobConf, "9160"); I tried setting the ports. Doesn't help.
//ConfigHelper.setInputRpcPort(jobConf, "9160");
ConfigHelper.setOutputPartitioner(jobConf,"Murmur3Partitioner");
ConfigHelper.setInputPartitioner(jobConf, "Murmur3Partitioner");
ConfigHelper.setInputColumnFamily(jobConf, "pinspace", "pinseries");
ConfigHelper.setOutputColumnFamily(jobConf, "pinspace", "pinseries");
System.out.println(ConfigHelper.getOutputColumnFamily(jobConf));
String query = "update pinspace.pinseries set timeseries = ?";
CqlConfigHelper.setOutputCql(jobConf, query);
CqlConfigHelper.setInputCQLPageRowSize(jobConf, "3");
job.setInputFormatClass(BoomInputFormat.class);
job.setMapperClass(GrepMapper.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
job.setReducerClass(GrepReducer.class);
job.setNumReduceTasks(1024);
job.setOutputFormatClass(CqlOutputFormat.class);
I get a Connection refused error with the following stack trace:
Error: java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:280)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
at org.apache.thrift.transport.TSocket.open(TSocket.java:187)
at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:41)
at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.createAuthenticatedClient(AbstractColumnFamilyOutputFormat.java:123)
at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:271)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
... 4 more
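I am using cassandra-driver-core 2.1.5 and cassandra-all 2.1.3.
The server can be pinged at the given IP from the data nodes, and both telnet 10.27.124.73 9160 and telnet 10.27.124.73 9042 connect successfully.
Edit:
I tried setting port 9042 in the code with ConfigHelper.setOutputRpcPort(jobConf, "9042");
The error then changed to: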
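The telnet checks above are run from a shell, while the failure happens inside the reduce-task JVMs, so it may be worth repeating the same reachability test from Java on a data node. This is a minimal sketch; the class name PortCheck, the method canConnect, and the 3-second timeout are illustrative and not part of the original job.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    // A refused connection or a timeout both surface as IOException and yield false.
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = "10.27.124.73"; // Cassandra node from the job configuration
        for (int port : new int[] {9160, 9042}) {
            System.out.println(host + ":" + port + " reachable: "
                    + canConnect(host, port, 3000));
        }
    }
}
```

Running it on the same hosts that execute the reducers shows whether those JVMs can open the ports that telnet reaches.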
Error: java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2097152000)!
at org.apache.cassandra.hadoop.cql3.CqlRecordWriter.<init>(CqlRecordWriter.java:130)
at org.apache.cassandra.hadoop.cql3.CqlRecordWriter.<init>(CqlRecordWriter.java:88)
at org.apache.cassandra.hadoop.cql3.CqlOutputFormat.getRecordWriter(CqlOutputFormat.java:74)
at org.apache.cassandra.hadoop.cql3.CqlOutputFormat.getRecordWriter(CqlOutputFormat.java:55)
at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:540)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:614)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2097152000)!
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:133)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
at org.apache.cassandra.thrift.Cassandra$Client.recv_set_keyspace(Cassandra.java:608)
at org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:595)
at org.apache.cassandra.hadoop.cql3.CqlRecordWriter.<init>(CqlRecordWriter.java:108)
... 11 more
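The negative frame size in this trace can be unpacked to see which bytes the Thrift client read. This is a minimal sketch, assuming that TFramedTransport interprets the first four bytes of a reply as a signed big-endian 32-bit length; the class name FrameSizeBytes is illustrative.

```java
import java.nio.ByteBuffer;

public class FrameSizeBytes {
    // Formats the four big-endian bytes of a 32-bit int as space-separated hex.
    public static String toHex(int value) {
        byte[] bytes = ByteBuffer.allocate(4).putInt(value).array(); // big-endian by default
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xff));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Frame size reported in the stack trace above.
        System.out.println(toHex(-2097152000)); // prints "83 00 00 00"
    }
}
```

A leading byte of 0x83 would be consistent with a version 3 response header of the CQL native protocol, which would suggest that the Thrift client is talking to a non-Thrift port. That reading is an interpretation and has not been verified against this cluster.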