Why does Apache ORC's RecordReader searchArgument() not filter rows?

Time: 2019-01-18 05:42:59

Tags: java

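I set a predicate to be pushed down when reading an ORC file, but judging by the printed output it has no effect: every row is printed, which is not what I want.

I also followed a previously suggested solution, but it did not solve the problem. Why?

Thanks!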

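import java.io.IOException;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;
import org.apache.hadoop.hive.ql.io.sarg.PredicateLeaf;
import org.apache.hadoop.hive.ql.io.sarg.SearchArgument;
import org.apache.hadoop.hive.ql.io.sarg.SearchArgumentFactory;
import org.apache.orc.OrcFile;
import org.apache.orc.OrcFile.ReaderOptions;
import org.apache.orc.Reader;
import org.apache.orc.RecordReader;
import org.apache.orc.StripeInformation;
import org.apache.orc.TypeDescription;

public class parseOrcFile {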

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        try {
            String file_path = "/apps/hive/warehouse/orc_stu/test.orc";
            ReaderOptions readerOptions = OrcFile.readerOptions(conf);
            Path path = new Path(file_path);
            Reader reader = OrcFile.createReader(path, readerOptions);
            List<StripeInformation> sis = reader.getStripes();
            TypeDescription schema = reader.getSchema();
            // Nest both predicates under one AND root: NOT (id < 100) AND id < 200, i.e. 100 <= id < 200.
            SearchArgument sarg = SearchArgumentFactory.newBuilder()
                        .startAnd()
                        .startNot()
                        .lessThan("id", PredicateLeaf.Type.LONG, 100L)
                        .end()
                        .lessThan("id", PredicateLeaf.Type.LONG, 200L)
                        .end()
                        .build();

            Reader.Options opt = reader.options()
                        .schema(schema)
                        .include(new boolean[]{true, true, true, true, true})
                        .searchArgument(sarg, new String[]{null, "id", "name", "age", "sex"});

            RecordReader read_row_opt = reader.rows(opt);
            VectorizedRowBatch rowBatch = schema.createRowBatch();
            // Print every batch the reader returns.
            while (read_row_opt.nextBatch(rowBatch)) {
                System.out.println(rowBatch.toString());
            }
            read_row_opt.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

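1 Answer:

Answer 0 (score: 1)

SearchArgument filters are applied only at the file, stripe, and row-group level. They do not filter individual rows within a row group. ORC checks the predicate against column statistics to skip files, stripes, or row groups that cannot contain a match; every row of a row group that is not skipped is still returned, so the caller has to apply the predicate to each row itself.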
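
To make the difference concrete, here is a minimal plain-Java sketch of the idea. It is a simplified model, not ORC's actual implementation: `RowGroup`, `mightContain`, `readWithPruning`, and `filterRows` are hypothetical names invented for illustration, and the sample ids are made up. It shows that pruning on min/max statistics can only drop whole groups, so a surviving group still yields rows outside the range `100 <= id < 200`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowGroupPruningSketch {

    /** Hypothetical stand-in for one row group: its id values plus min/max statistics. */
    static final class RowGroup {
        final long[] ids;
        final long min;
        final long max;

        RowGroup(long... ids) {
            this.ids = ids;
            long lo = Long.MAX_VALUE;
            long hi = Long.MIN_VALUE;
            for (long id : ids) {
                lo = Math.min(lo, id);
                hi = Math.max(hi, id);
            }
            this.min = lo;
            this.max = hi;
        }

        /** True if a value in [lower, upper) could be present, judging by min/max only. */
        boolean mightContain(long lower, long upper) {
            return max >= lower && min < upper;
        }
    }

    /** Coarse pruning: skips whole groups, but returns every row of each surviving group. */
    static List<Long> readWithPruning(List<RowGroup> groups, long lower, long upper) {
        List<Long> out = new ArrayList<>();
        for (RowGroup g : groups) {
            if (!g.mightContain(lower, upper)) {
                continue; // the statistics prove that no row in this group can match
            }
            for (long id : g.ids) {
                out.add(id);
            }
        }
        return out;
    }

    /** Row-level filter that the caller still has to apply to what was read. */
    static List<Long> filterRows(List<Long> ids, long lower, long upper) {
        List<Long> out = new ArrayList<>();
        for (long id : ids) {
            if (id >= lower && id < upper) {
                out.add(id);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<RowGroup> groups = Arrays.asList(
            new RowGroup(1, 50, 99),     // max < 100, skipped
            new RowGroup(90, 150, 250),  // overlaps the range, read in full
            new RowGroup(300, 400));     // min >= 200, skipped

        List<Long> pruned = readWithPruning(groups, 100, 200);
        System.out.println(pruned);                       // prints [90, 150, 250]
        System.out.println(filterRows(pruned, 100, 200)); // prints [150]
    }
}
```

With the real reader, the equivalent of `filterRows` would probably be a loop over the first `rowBatch.size` rows of each batch that tests the `id` column, for example the `vector` array of a `LongColumnVector`, while also honouring its `isRepeating` and null flags. If the file is small enough to hold a single row group, nothing can be skipped at all, which would match the output described in the question.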