HBase: multiple ValueFilters for filtering multiple columns of the same row

Time: 2018-02-05 08:10:59

Tags: hbase

I want to use the HBase API to implement something similar to this SQL query:

SELECT * FROM customer_table WHERE firstname = "Joe" AND lastname = "Bloggs" AND email = "joe@blah.com" 

HBase table:

1                column=p:firstname, timestamp=<t>, value=Joe                                                                         
1                column=p:lastname, timestamp=<t>, value=Bloggs                                                                            
1                column=p:email, timestamp=<t>, value=joe@blah.com                                                                            
2                column=p:firstname, timestamp=<t>, value=Joe                                                                         
2                column=p:lastname, timestamp=<t>, value=Bloggs                                                                            
2                column=p:email, timestamp=<t>, value=joe@blah.com
3                column=p:firstname, timestamp=<t>, value=Joe                                                                         
3                column=p:lastname, timestamp=<t>, value=Bloggs                                                                            
3                column=p:email, timestamp=<t>, value=joe@blah.com

Currently I have this:

val filters = Array("Joe", "Bloggs", "joe@blah.com")

// AND operator
val filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL) 

filters.foreach(f => {
  filterList.addFilter(new ValueFilter(CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes(f))))
})

val scan = new Scan().setFilter(filterList)
val resultScanner = table.getScanner(scan)

However, this returns no results. I expected it to return all 3 rows. Is there another filter or feature that achieves this?

1 answer:

Answer 0: (score: 0)

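This answer helped me. I needed to use SingleColumnValueFilter rather than ValueFilter. Because it is an AND operation, you need to create one column value filter for each field that needs filtering:

import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.filter.{BinaryComparator, FilterList, SingleColumnValueFilter}
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp
import org.apache.hadoop.hbase.util.Bytes

// Column family of the table shown in the question
val columnFamily = "p"

// Column qualifier -> expected value
val filterMap = Map("firstname" -> "Joe", "lastname" -> "Bloggs", "email" -> "joe@blah.com")

// AND operator
val filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL)

// One filter per column, each compared against that column's expected value
filterMap.foreach(kv => {
  filterList.addFilter(new SingleColumnValueFilter(
    Bytes.toBytes(columnFamily),
    Bytes.toBytes(kv._1),
    CompareOp.EQUAL,
    new BinaryComparator(Bytes.toBytes(kv._2))
  ))
})

val scan = new Scan().setFilter(filterList)
val resultScanner = table.getScanner(scan)

Doing this works well.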

It returns the 3 rows as expected.
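To see why the ValueFilter version in the question matches nothing, the two strategies can be modelled in plain Scala. This is only a rough sketch: `Cell`, `cellLevelAnd` and `rowLevelAnd` are hypothetical names for an in-memory model, not part of the HBase API, and the model assumes every filter in a MUST_PASS_ALL list is checked against each cell.

```scala
// Simplified in-memory model (NOT the HBase API): a cell is (row, qualifier, value).
object FilterModel {
  final case class Cell(row: String, qualifier: String, value: String)

  // The three rows from the table in the question.
  val cells: Seq[Cell] = for {
    row <- Seq("1", "2", "3")
    (q, v) <- Seq("firstname" -> "Joe", "lastname" -> "Bloggs", "email" -> "joe@blah.com")
  } yield Cell(row, q, v)

  // ValueFilter-style AND: every condition is tested against every cell,
  // so a cell survives only if its single value equals all wanted values.
  def cellLevelAnd(data: Seq[Cell], wanted: Seq[String]): Seq[Cell] =
    data.filter(c => wanted.forall(_ == c.value))

  // SingleColumnValueFilter-style AND: each condition names one column,
  // and a row is kept when every named column holds its wanted value.
  def rowLevelAnd(data: Seq[Cell], wanted: Map[String, String]): Set[String] =
    data.groupBy(_.row).collect {
      case (row, rowCells) if wanted.forall { case (q, v) =>
        rowCells.exists(c => c.qualifier == q && c.value == v)
      } => row
    }.toSet

  def main(args: Array[String]): Unit = {
    val wanted = Map("firstname" -> "Joe", "lastname" -> "Bloggs", "email" -> "joe@blah.com")
    println(cellLevelAnd(cells, wanted.values.toSeq)) // no cell equals all three values
    println(rowLevelAnd(cells, wanted))               // rows where every column matches
  }
}
```

In this model the cell-level AND comes back empty, because no single value can equal "Joe", "Bloggs" and "joe@blah.com" at once, while the per-column AND keeps rows 1, 2 and 3. One difference from the model: in HBase itself, SingleColumnValueFilter by default lets through rows that lack the column entirely; calling setFilterIfMissing(true) on each filter excludes them.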