For example, I have the following avsc file:
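[
  {
    "type": "record",
    "namespace": "com.example",
    "name": "Customer",
    "fields": [
      {"name": "first_name", "type": "string", "doc": "First name of the customer"},
      {"name": "last_name", "type": "string", "doc": "Last name of the customer"},
      {"name": "age", "type": "int", "doc": "Age at the time of registration"},
      {"name": "height", "type": "float", "doc": "Height at the time of registration, in cm"},
      {"name": "weight", "type": "float", "doc": "Weight at the time of registration, in kg"},
      {"name": "automated_email", "type": "boolean", "default": true, "doc": "Field indicating whether the user is enrolled in marketing emails"}
    ]
  },
  {
    "type": "record",
    "namespace": "com.example",
    "name": "Customers",
    "fields": [
      {"name": "customers", "type": {"type": "array", "items": "com.example.Customer"}, "doc": "Age at the time of registration"}
    ]
  }
]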
I have a few customers and have added them to a Customers record:
Customer.Builder customerBuilder1 = Customer.newBuilder();
customerBuilder1.setAge(30);
customerBuilder1.setFirstName("Mark");
customerBuilder1.setLastName("Simpson");
customerBuilder1.setAutomatedEmail(true);
customerBuilder1.setHeight(180f);
customerBuilder1.setWeight(90f);
Customer.Builder customerBuilder2 = Customer.newBuilder();
customerBuilder2.setAge(30);
customerBuilder2.setFirstName("Vishant");
customerBuilder2.setLastName("Shah");
customerBuilder2.setAutomatedEmail(true);
customerBuilder2.setHeight(181f);
customerBuilder2.setWeight(65f);
Customer customer1 = customerBuilder1.build();
System.out.println("Original : " + customer1.toString());
Customer customer2 = customerBuilder2.build();
System.out.println("Original : " + customer2.toString());
Customers.Builder customersBuilder = Customers.newBuilder();
customersBuilder.setCustomers(Arrays.asList(customer1, customer2));
Customers customers = customersBuilder.build();
//Write parquet file
try (ParquetWriter<Customers> writer = AvroParquetWriter
.<Customers>builder(new Path("customers-specific.parquet"))
.withSchema(customers.getSchema())
.withConf(new Configuration())
.withCompressionCodec(CompressionCodecName.SNAPPY)
.build()) {
writer.write(customers);
}
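How can I apply a predicate on the "first_name" of the customers inside the array? Without the complex object this would be simple, but it does not work for the array.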
// Attempted filter on first_name; this is the part that does not work for the nested array
FilterPredicate predicate = eq(binaryColumn("first_name"), Binary.fromString("Vishant"));
// The file holds Customers records, so the reader and the loop variable use Customers
try (ParquetReader<Customers> selectiveReader = AvroParquetReader.<Customers>builder(new Path("customers-specific.parquet"))
        .withFilter(FilterCompat.get(predicate))
        .build()) {
    Customers selectedCustomers;
    while ((selectedCustomers = selectiveReader.read()) != null) {
        System.out.println("Selected Read: " + selectedCustomers.toString());
    }
}
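
For comparison, here is the behaviour I am after, written as a plain in-memory filter. This is only a sketch of a fallback, assuming the Customers records are read without any filter and the matching entries are picked out afterwards, so nothing is pushed down to Parquet. The Customer class below is a hypothetical stand-in for the Avro-generated com.example.Customer, and filterByFirstName is an illustrative helper of my own, not part of Parquet or Avro.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CustomerFilterSketch {

    // Hypothetical stand-in for the Avro-generated com.example.Customer class.
    // Avro getters for string fields return CharSequence by default, so the
    // stand-in does the same.
    static final class Customer {
        final CharSequence firstName;
        final CharSequence lastName;

        Customer(CharSequence firstName, CharSequence lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }

        CharSequence getFirstName() { return firstName; }
        CharSequence getLastName() { return lastName; }
    }

    // Illustrative helper: keeps only the customers whose first name matches.
    // contentEquals compares characters, so it also works when the getter
    // returns a non-String CharSequence.
    static List<Customer> filterByFirstName(List<Customer> customers, String wanted) {
        List<Customer> matches = new ArrayList<>();
        for (Customer c : customers) {
            if (c.getFirstName() != null && wanted.contentEquals(c.getFirstName())) {
                matches.add(c);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // In the real program this list would come from customers.getCustomers()
        List<Customer> all = Arrays.asList(
                new Customer("Mark", "Simpson"),
                new Customer("Vishant", "Shah"));

        for (Customer c : filterByFirstName(all, "Vishant")) {
            System.out.println("Selected Read: " + c.getFirstName() + " " + c.getLastName());
        }
    }
}
```

With the two customers above this prints "Selected Read: Vishant Shah". What I would like is the same result through a Parquet filter predicate, so that the selection happens while reading rather than after it.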