Question

I am getting an error that appears to come from my custom filter. The error is:

Caused by: org.apache.hadoop.hbase.exceptions.DeserializationException: parseFrom called on base Filter, but should be called on derived type

I found out that HBase 0.98 no longer supports write and readFields. Currently, my constructor and my write and readFields methods look like this:

public MyCustomFilter(Schema first) {
  this.first = first;
  filterNow();
}

// Serializes the Avro schema as a length-prefixed byte array.
public void write(DataOutput o) throws IOException {
  byte[] firstBytes = Bytes.toBytes(first.toString());
  o.writeInt(firstBytes.length);
  o.write(firstBytes);
}

// Reads the length prefix, then the schema bytes, and rebuilds the schema.
public void readFields(DataInput i) throws IOException {
  int firstLength = i.readInt();
  byte[] firstBytes = new byte[firstLength];
  i.readFully(firstBytes, 0, firstLength);
  this.first = new Schema.Parser().parse(new ByteArrayInputStream(firstBytes));
  filterNow();
}

private void filterNow() {
   FilterQueryParser parser = new FilterQueryParser(first);
   ....
}

Cloudera seems to think it is just a matter of moving these methods:

FilterBase no longer implements Writable. This means that you do not need to implement readFields() and write() methods when writing your own custom fields. Instead, put this logic into the toByteArray and parseFrom methods. See this page for an example.

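However, the example given, SingleColumnValueFilter, seems to use protobufs from FilterProtos, which appears to contain a SingleColumnValueFilter message because that filter is part of HBase core. My custom filter has no such generated type, and I do not use protobufs at all. Is there a way to convert what I have into something HBase 0.98 is happy with? (Schema.Parser is Avro.) Do I need to use FilterProtos now? If so, how?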

Answer 1

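You have to implement the method public byte[] toByteArray() instead of public void write(DataOutput o) to serialize the filter instance on the client so that it can be sent to the HBase server side.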

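Likewise, you have to implement the method public static Filter parseFrom(final byte[] pbBytes) instead of public void readFields(DataInput i), so that HBase can read the bytes on the server side and turn them back into an instance of your filter.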

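Unfortunately, we seem to have lost the handy DataInput and DataOutput objects that were available in 0.94, and now have to work with raw byte arrays. For example, to write a String, instead of relying on DataOutput.writeUTF and DataInput.readUTF, we have to "manually" write an int before the string in the byte array so that the reader knows how many bytes the upcoming String occupies.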
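As a minimal sketch of the length-prefix idea described above, the helper below writes a 4-byte length followed by the UTF-8 bytes of a string, and reads them back. The class name LengthPrefixedString and its two methods are hypothetical and use only the JDK; they are not part of HBase.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: length-prefixed string encoding.
final class LengthPrefixedString {

    // Layout: 4-byte big-endian length, then the UTF-8 bytes of the string.
    static byte[] encode(String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + utf8.length);
        buffer.putInt(utf8.length);
        buffer.put(utf8);
        return buffer.array();
    }

    // Reads the length prefix first, then exactly that many bytes.
    static String decode(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        int length = buffer.getInt();
        if (length < 0 || length > buffer.remaining()) {
            throw new IllegalArgumentException("Corrupt length prefix: " + length);
        }
        byte[] utf8 = new byte[length];
        buffer.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

Calling LengthPrefixedString.decode(LengthPrefixedString.encode(s)) should return a string equal to s. The length check guards against allocating a huge array when the input is truncated or corrupt.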

In any case, this kind of "manual" byte-array handling is what I use to serialize and deserialize a custom filter on HBase 1.0.2.
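One possible way to carry the old write/readFields logic over is to wrap a ByteArrayOutputStream in a DataOutputStream, which keeps the familiar DataOutput calls while still producing a byte[]. The sketch below is only an illustration under stated assumptions: SchemaFilterSketch is a hypothetical stand-in that uses the JDK alone, and it holds the schema as a JSON string in place of the Avro Schema from the question. A real filter would extend FilterBase, return Filter from parseFrom, and would probably report parse failures as DeserializationException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the custom filter. A real one would extend
// org.apache.hadoop.hbase.filter.FilterBase and keep an Avro Schema field.
final class SchemaFilterSketch {
    private final String schemaJson; // stands in for first.toString()

    SchemaFilterSketch(String schemaJson) {
        this.schemaJson = schemaJson;
    }

    String getSchemaJson() {
        return schemaJson;
    }

    // Takes the place of write(DataOutput): same logic, result is a byte[].
    byte[] toByteArray() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            byte[] schemaBytes = schemaJson.getBytes(StandardCharsets.UTF_8);
            out.writeInt(schemaBytes.length);
            out.write(schemaBytes);
        }
        return bytes.toByteArray();
    }

    // Takes the place of readFields(DataInput): static, returns a new instance.
    static SchemaFilterSketch parseFrom(byte[] pbBytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(pbBytes))) {
            int length = in.readInt();
            if (length < 0 || length > in.available()) {
                throw new IOException("Corrupt length prefix: " + length);
            }
            byte[] schemaBytes = new byte[length];
            in.readFully(schemaBytes);
            return new SchemaFilterSketch(new String(schemaBytes, StandardCharsets.UTF_8));
        }
    }
}
```

Because parseFrom is static, it has to build a new instance from the bytes instead of filling in fields on an existing one, so any setup the constructor performs (filterNow() in the question) runs again when the filter is rebuilt on the server.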

How do I convert a custom filter from HBase 0.94 to HBase 0.98 without using any protobufs?
