Converting a SequenceFile to .txt

Time: 2012-05-29 01:47:14

Tags: java hadoop

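Is there a way to convert a Sequence file to a .txt file? The sequence file is generated by a Hadoop job, and when I try to read it with SequenceFileReader it gives me an EOFException, even though the job completed successfully. So I thought I could copy the sequence file to the local system and then convert it to txt format, if that is possible.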

1 answer:

Answer 0: (score: 1)

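Changing the file from seq to text is not the right solution. Look at the underlying problem instead. You can try something like this to read the key/value pairs: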

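import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SequenceFileReader {
    public static void main(String[] args) throws Exception {
        System.out.println("Reading Sequence File");
        // Load the cluster settings so FileSystem.get() resolves to HDFS.
        Configuration conf = new Configuration();
        conf.addResource(new Path("/home/mohammad/hadoop-0.20.203.0/conf/core-site.xml"));
        conf.addResource(new Path("/home/mohammad/hadoop-0.20.203.0/conf/hdfs-site.xml"));
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/seq/file");
        SequenceFile.Reader reader = null;
        try {
            reader = new SequenceFile.Reader(fs, path, conf);
            // The key and value classes are recorded in the file header.
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
            // next() returns false once the end of the file is reached.
            while (reader.next(key, value)) {
                System.out.println(key + "  <===>  " + value);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Null-safe close.
            IOUtils.closeStream(reader);
        }
    }
}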

You can use the "hadoop fs -text seqfile" command to convert the seq file to a text file, but ...
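
Since the question mentions copying the file to the local system, one quick way to narrow down the EOFException is to check whether the local copy really starts with a SequenceFile header. A SequenceFile begins with the three bytes 'S', 'E', 'Q' followed by a version byte, so an empty file or a file in another format fails this check. The following is only a minimal sketch under that assumption: the class name SeqHeaderCheck and the method looksLikeSequenceFile are made up for illustration and are not part of Hadoop, and whether an empty or non-SequenceFile input is actually the cause of the exception in the question is a guess.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;

// Hypothetical helper, not part of Hadoop: inspects a LOCAL copy of a file.
final class SeqHeaderCheck {

    /** Returns true if the file starts with the SequenceFile magic bytes 'S', 'E', 'Q'. */
    public static boolean looksLikeSequenceFile(String localPath) throws IOException {
        try (DataInputStream in = new DataInputStream(new FileInputStream(localPath))) {
            byte[] magic = new byte[3];
            // readFully throws EOFException when the file holds fewer than 3 bytes.
            in.readFully(magic);
            return magic[0] == 'S' && magic[1] == 'E' && magic[2] == 'Q';
        } catch (EOFException e) {
            // Too short to contain a header, e.g. a zero-length file.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument is the path of a local file to inspect.
        for (String p : args) {
            String verdict = looksLikeSequenceFile(p)
                    ? "SequenceFile header found"
                    : "no SequenceFile header (empty file or different format)";
            System.out.println(p + " -> " + verdict);
        }
    }
}
```

If the check fails for the path being read, the reader is probably being pointed at something other than the job's actual SequenceFile output, for example an empty file in the output directory, and it is worth confirming the exact path before attempting any conversion.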