Question

I have some code written against HBase 0.92:

/**
   * Writes the given scan into a Base64 encoded string.
   *
   * @param scan  The scan to write out.
   * @return The scan saved in a Base64 encoded string.
   * @throws IOException When writing the scan fails.
   */
  public static String convertScanToString(Scan scan) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DataOutputStream dos = new DataOutputStream(out);
    scan.write(dos);
    return Base64.encodeBytes(out.toByteArray());
  }

  /**
   * Converts the given Base64 string back into a Scan instance.
   *
   * @param base64  The scan details.
   * @return The newly created Scan instance.
   * @throws IOException When reading the scan instance fails.
   */
  static Scan convertStringToScan(String base64) throws IOException {
    ByteArrayInputStream bis = new  ByteArrayInputStream(Base64.decode(base64));
    DataInputStream dis = new DataInputStream(bis);
    Scan scan = new Scan();
    scan.readFields(dis);
    return scan;
  }

I need to migrate this code to HBase 0.98.0-hadoop2. The Scan class no longer has the write() and readFields() methods. Can someone help me solve this?

Answer 1

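In HBase 0.92, the Scan class implemented the Writable interface, which handled serialization. As of HBase 0.96, this kind of manual serialization was deprecated in favor of Google's protobuf (see issue HBASE-6477).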
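For readers who have not seen the Writable pattern, here is a minimal sketch of what the 0.92 code relied on. It uses only the JDK: ToyScan is a hypothetical stand-in with made-up fields, not the HBase Scan class, and java.util.Base64 (Java 8+) is used in place of HBase's own Base64 utility.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Base64;

class WritableStyleDemo {

  // Hypothetical stand-in for a Writable-style class; NOT the HBase Scan.
  static class ToyScan {
    String startRow = "";
    String stopRow = "";
    int maxVersions = 1;

    ToyScan() {}

    ToyScan(String startRow, String stopRow, int maxVersions) {
      this.startRow = startRow;
      this.stopRow = stopRow;
      this.maxVersions = maxVersions;
    }

    // The object writes its own fields, in a fixed order.
    void write(DataOutput out) throws IOException {
      out.writeUTF(startRow);
      out.writeUTF(stopRow);
      out.writeInt(maxVersions);
    }

    // Fields must be read back in exactly the order they were written.
    void readFields(DataInput in) throws IOException {
      startRow = in.readUTF();
      stopRow = in.readUTF();
      maxVersions = in.readInt();
    }
  }

  // Same shape as convertScanToString in the question.
  static String convertToString(ToyScan scan) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DataOutputStream dos = new DataOutputStream(out);
    scan.write(dos);
    dos.flush();
    return Base64.getEncoder().encodeToString(out.toByteArray());
  }

  // Same shape as convertStringToScan in the question.
  static ToyScan convertFromString(String base64) throws IOException {
    byte[] raw = Base64.getDecoder().decode(base64);
    DataInputStream dis = new DataInputStream(new ByteArrayInputStream(raw));
    ToyScan scan = new ToyScan();
    scan.readFields(dis);
    return scan;
  }

  public static void main(String[] args) throws IOException {
    ToyScan copy = convertFromString(convertToString(new ToyScan("row-a", "row-z", 3)));
    System.out.println(copy.startRow + " " + copy.stopRow + " " + copy.maxVersions);
  }
}
```

Because the byte layout is defined entirely by the hand-written write() and readFields() methods, nothing in the bytes describes the fields themselves, which is one reason a format like this cannot be read by a protobuf parser.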

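To serialize an org.apache.hadoop.hbase.client.Scan with HBase 0.96+, you first need to convert it to an org.apache.hadoop.hbase.protobuf.generated.ClientProtos.Scan. The two overloaded ProtobufUtil.toScan() methods convert in either direction. ClientProtos.Scan provides serialization and deserialization methods such as toByteArray and parseFrom.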

Using protobuf, the code could be rewritten like this (untested).

Write:

  public static String convertScanToString(Scan scan) throws IOException {
    return Base64.encodeBytes(ProtobufUtil.toScan(scan).toByteArray());
  }

Read:

  static Scan convertStringToScan(String base64) throws IOException {
    return ProtobufUtil.toScan(ClientProtos.Scan.parseFrom(Base64.decode(base64)));
  }

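The protobuf serialization is not compatible with the old Writable serialization. If you need to convert between the two, I would suggest simply taking the old serialization code from the 0.92 Scan class.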
