Hadoop and Cassandra - InvalidRequestException(why:Column timestamp is required)

Time: 2013-05-10 11:02:12

Tags: hadoop cassandra

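I am running a simple mapred job on a Cassandra cluster, but when it tries to save the output to a table I get InvalidRequestException(why:Column timestamp is required).

I tried manually adding a 'timestamp' column to the CF, but it made no difference.

Here is the description of my CF (as interpreted by cqlsh):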

CREATE TABLE output_words (
  key text PRIMARY KEY,
  "count" int,
) WITH COMPACT STORAGE AND
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};

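I am using a POM with hadoop-core v1.1.2 and cassandra-thrift v1.2.4, running against Cassandra v1.2.4.

Can anyone suggest how to fix this?

Additional information

I configure my job as follows (showing only the output-related configuration):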

Job job = new Job(getConf(), "wordcount");

job.setJarByClass(TestJob.class);
job.setMapperClass(TokenizerMapper.class);
job.setReducerClass(ReducerToCassandra.class);

job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);
job.setOutputKeyClass(ByteBuffer.class);
job.setOutputValueClass(List.class);

job.setOutputFormatClass(ColumnFamilyOutputFormat.class);

ConfigHelper.setOutputColumnFamily(job.getConfiguration(), _keyspace, OUTPUT_COLUMN_FAMILY);

ConfigHelper.setOutputRpcPort(job.getConfiguration(), _port);
ConfigHelper.setOutputInitialAddress(job.getConfiguration(), _host);
ConfigHelper.setOutputPartitioner(job.getConfiguration(), "org.apache.cassandra.dht.Murmur3Partitioner");

My reducer class:

public static class ReducerToCassandra extends Reducer<Text, IntWritable, ByteBuffer, List<Mutation>>
{
    public void reduce(Text word, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        context.write(StringSerializer.get().toByteBuffer(word.toString()), Collections.singletonList(getMutation(word, sum)));
    }

    private static Mutation getMutation(Text word, int sum) {
        Column c = new Column();
        c.name = StringSerializer.get().toByteBuffer("count");
        c.value = IntegerSerializer.get().toByteBuffer(sum);
        c.timestamp = System.currentTimeMillis() * 1000;

        Mutation m = new Mutation();
        m.column_or_supercolumn = new ColumnOrSuperColumn();
        m.column_or_supercolumn.column = c;
        return m;
    }
}

1 answer:

Answer 0 (score: 1)

Instead of this

c.timestamp = System.currentTimeMillis() * 1000;

can you try this?

c.setTimestamp(System.currentTimeMillis() * 1000);
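
A likely reason the setter matters: in Cassandra's Thrift IDL, timestamp is an optional i64 field on Column, and Thrift-generated Java classes normally track whether such a primitive field has been set with a separate flag that only the setter updates. Assigning the public field directly leaves that flag unset, so the value is never sent and the server reports the timestamp as missing. The sketch below imitates that behaviour with a hypothetical stand-in class. FakeColumn, validate and IssetSketch are made-up names for illustration; this is not the real org.apache.cassandra.thrift.Column.

```java
public class IssetSketch {

    // Hypothetical stand-in for a Thrift-generated struct with an optional i64 field.
    // Real generated code keeps the "is set" state in a bitfield; a boolean is enough here.
    static final class FakeColumn {
        public long timestamp;           // public field, as in generated code
        private boolean timestampIsSet;  // only the setter flips this flag

        public FakeColumn setTimestamp(long timestamp) {
            this.timestamp = timestamp;
            this.timestampIsSet = true;
            return this;
        }

        public boolean isSetTimestamp() {
            return timestampIsSet;
        }
    }

    // Assumed shape of the receiving side's check: it looks at the flag, not the value.
    static String validate(FakeColumn c) {
        return c.isSetTimestamp() ? "ok" : "Column timestamp is required";
    }

    // Direct field assignment: the value is stored, but the flag stays false.
    static String viaFieldAssignment() {
        FakeColumn c = new FakeColumn();
        c.timestamp = System.currentTimeMillis() * 1000;
        return validate(c);
    }

    // Setter: the value is stored and the flag is set.
    static String viaSetter() {
        FakeColumn c = new FakeColumn();
        c.setTimestamp(System.currentTimeMillis() * 1000);
        return validate(c);
    }

    public static void main(String[] args) {
        System.out.println("field assignment: " + viaFieldAssignment());
        System.out.println("setter:           " + viaSetter());
    }
}
```

If that is what is happening here, the same would apply to any other optional primitive field set on the Thrift objects, so using the generated setters throughout getMutation is the safer habit.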