Question

I have backed up an HBase table using the HBase Export utility.

hbase org.apache.hadoop.hbase.mapreduce.Export "FinancialLineItem" "/project/fricadev/ESGTRF/EXPORT"

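This kicked off a MapReduce job and transferred all of my table data into the output folder. According to the documentation, the output files are in SequenceFile format, so I wrote the code below to extract my keys and values from the files.

Now I want to run a MapReduce job that reads the key/value pairs from the output files, but I get the following exception: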

java.lang.Exception: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:406)
Caused by: java.io.IOException: Could not find a deserializer for the Value class: 'org.apache.hadoop.hbase.client.Result'. Please ensure that the configuration 'io.serializations' is properly configured, if you're using custom serialization.
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1964)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1811)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1760)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1774)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:50)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:478)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:671)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)

Here is my driver code:

package SEQ;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
public class SeqDriver extends Configured implements Tool 
{
    public static void main(String[] args) throws Exception{
        int exitCode = ToolRunner.run(new SeqDriver(), args);
        System.exit(exitCode);
    }

    public int run(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.printf("Usage: %s <input path> <output path>%n",
                    getClass().getSimpleName());
            return -1;
        }
        String outputPath = args[1];

        FileSystem hfs = FileSystem.get(getConf());
        Job job = new Job();
        job.setJarByClass(SeqDriver.class);
        job.setJobName("SequenceFileReader");

        HDFSUtil.removeHdfsSubDirIfExists(hfs, new Path(outputPath), true);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.setOutputKeyClass(ImmutableBytesWritable.class);
        job.setOutputValueClass(Result.class);
        job.setInputFormatClass(SequenceFileInputFormat.class);

        job.setMapperClass(MySeqMapper.class);

        job.setNumReduceTasks(0);


        int returnValue = job.waitForCompletion(true) ? 0:1;

        if(job.isSuccessful()) {
            System.out.println("Job was successful");
        } else {
            System.out.println("Job was not successful");
        }

        return returnValue;
    }
}

Here is my mapper code:

package SEQ;

import java.io.IOException;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MySeqMapper extends Mapper <ImmutableBytesWritable, Result, Text, Text>{

    @Override
    public void map(ImmutableBytesWritable row, Result value,Context context)
    throws IOException, InterruptedException {
    }
  }

Answer 1

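I'll answer my own question. Here is what was needed to make it work.

Because we use HBase to store our data and this reducer writes its results to an HBase table, Hadoop tells us that it does not know how to serialize our data. That is why we need to help it. Inside setUp, set the io.serializations variable so that it also lists the HBase serializers: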

hbaseConf.setStrings("io.serializations", new String[]{hbaseConf.get("io.serializations"), MutationSerialization.class.getName(), ResultSerialization.class.getName()});
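
The line above appends the HBase serializers to whatever io.serializations already contains. As a rough sketch of the same idea, the helper below builds that comma-separated value in plain Java. It skips an unset (null) existing value and avoids adding the same class twice. The class name SerializationConfig and the method appendSerializations are hypothetical names made up for this illustration. The HBase class names are written as strings so that the sketch compiles without HBase on the classpath.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Hadoop or HBase.
final class SerializationConfig {
    // HBase serializer class names, kept as strings so no HBase jar is needed here.
    static final String RESULT_SERIALIZATION =
            "org.apache.hadoop.hbase.mapreduce.ResultSerialization";
    static final String MUTATION_SERIALIZATION =
            "org.apache.hadoop.hbase.mapreduce.MutationSerialization";

    private SerializationConfig() {}

    /**
     * Builds the comma-separated value for "io.serializations":
     * existing entries first, then any extra entries not already listed.
     * A null or empty existing value contributes nothing.
     */
    static String appendSerializations(String existing, String... extra) {
        Set<String> names = new LinkedHashSet<>();
        if (existing != null) {
            for (String name : existing.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed);
                }
            }
        }
        for (String name : extra) {
            names.add(name);
        }
        return String.join(",", names);
    }
}
```

In the driver this would presumably be used as conf.set("io.serializations", SerializationConfig.appendSerializations(conf.get("io.serializations"), SerializationConfig.MUTATION_SERIALIZATION, SerializationConfig.RESULT_SERIALIZATION)). The setting only helps if it is applied to the Configuration that the job actually uses, for example one passed to Job.getInstance(conf). The driver in the question calls new Job(), which builds its own Configuration, so a value set on getConf() would not reach that job.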
