Question

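I need to write a byte array value into Cassandra using Java code. My C++ program will then retrieve that byte array from Cassandra and deserialize it.

The byte array I will write into Cassandra is made up of three parts, as described below: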

short employeeId = 32767;
long lastModifiedDate = 1379811105109L;
byte[] attributeValue = os.toByteArray();

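I will write employeeId, lastModifiedDate and attributeValue together into a single byte array and store the resulting byte array in Cassandra. My C++ program will then retrieve that byte array from Cassandra and deserialize it to extract employeeId, lastModifiedDate and attributeValue.

I am not sure whether I should use Big Endian in my Java code when writing to Cassandra, so that the C++ code stays simple when reading the data back.

On the Java side I have tried to write everything into a single byte array in a fixed format (Big Endian) and then write that byte array to Cassandra, but I am not sure whether this is right.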

public static void main(String[] args) throws Exception {

    String os = "Byte Array Test";
    byte[] attributeValue = os.getBytes();

    long lastModifiedDate = 1379811105109L;
    short employeeId = 32767;

    ByteArrayOutputStream byteOsTest = new ByteArrayOutputStream();
    DataOutputStream outTest = new DataOutputStream(byteOsTest);

    // merging everything into one Byte Array here
    outTest.writeShort(employeeId);
    outTest.writeLong(lastModifiedDate);
    outTest.writeInt(attributeValue.length);
    outTest.write(attributeValue);

    byte[] allWrittenBytesTest = byteOsTest.toByteArray();

    // initially I was writing allWrittenBytesTest into Cassandra...

    ByteBuffer bb = ByteBuffer.wrap(allWrittenBytesTest).order(ByteOrder.BIG_ENDIAN);

    // now what value I should write into Cassandra?
    // or does this even looks right?

    // And now how to deserialize it?

}

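Can anyone help me with this ByteBuffer question? Thanks.

I may be overlooking some details about ByteBuffer here, as this is the first time I am using it.

First, should I be using ByteBuffer at all in my use case?
Second, if yes, what is the best way to use it in my use case?

The only thing I want to make sure of is that I am writing to Cassandra correctly in Big Endian byte order, so that the C++ side has no problems deserializing the byte array.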

Answer 1

Instead of manually serializing ByteBuffers for Thrift, use Cassandra's native CQL driver: http://github.com/datastax/java-driver

Answer 2

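Endianness is meaningless for a plain byte array. So as long as Cassandra does not try to interpret your data, it does not matter whether you use big or little endian. Byte order only matters for multi-byte values.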
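To make the distinction concrete, here is a minimal sketch (the class and method names are made up for illustration). It encodes the same short with both byte orders and then copies a raw byte array through buffers of both orders: the short comes out byte-swapped, while the raw array is unaffected.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical demo class, not part of any library.
class EndiannessDemo {
    // Encodes one short with the given byte order and returns its 2 raw bytes.
    static byte[] encodeShort(short value, ByteOrder order) {
        return ByteBuffer.allocate(2).order(order).putShort(value).array();
    }

    // Copies a raw byte array through a buffer that uses the given byte order.
    static byte[] copyBytes(byte[] src, ByteOrder order) {
        return ByteBuffer.allocate(src.length).order(order).put(src).array();
    }

    public static void main(String[] args) {
        short value = 0x1234;
        // Multi-byte value: the byte order decides which byte comes first.
        System.out.println(Arrays.toString(encodeShort(value, ByteOrder.BIG_ENDIAN)));    // [18, 52]
        System.out.println(Arrays.toString(encodeShort(value, ByteOrder.LITTLE_ENDIAN))); // [52, 18]

        // Raw bytes: the buffer's byte order has no effect on them.
        byte[] raw = {1, 2, 3};
        System.out.println(Arrays.equals(
                copyBytes(raw, ByteOrder.BIG_ENDIAN),
                copyBytes(raw, ByteOrder.LITTLE_ENDIAN))); // true
    }
}
```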

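If your data will be consumed by different clients, possibly on different platforms, I would suggest agreeing on a convention (for example, big endian) and using the same endianness on every client. For example, the Java client code could look like this: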

// 14 = 2 (short) + 8 (long) + 4 (int length prefix)
ByteBuffer bb = ByteBuffer.allocate(attributeValue.length + 14).order(ByteOrder.BIG_ENDIAN);
bb.putShort(employeeId);
bb.putLong(lastModifiedDate);
bb.putInt(attributeValue.length);
bb.put(attributeValue);

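You have to use ByteBuffer if you work with an API that requires it. For example, NIO channels operate on ByteBuffers, so if you connect through a SocketChannel you can use a ByteBuffer. You can also use ByteBuffer to lay out multi-byte values correctly. For example, with the code above you can take the byte array from the buffer and send it over a socket, with the first three fields packed in big-endian notation: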

sendByteArray(bb.array());
...
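For comparison with the code in the question: DataOutputStream is documented to write short, int and long values high byte first, so it should already produce the same big-endian layout as the ByteBuffer snippet above. The sketch below checks that, using the field values from the question; the class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical comparison class, not part of any library.
class StreamVersusBuffer {
    // Same sequence of writes as the code in the question.
    static byte[] viaDataOutputStream(short employeeId, long lastModifiedDate,
                                      byte[] attributeValue) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeShort(employeeId);
        out.writeLong(lastModifiedDate);
        out.writeInt(attributeValue.length);
        out.write(attributeValue);
        out.flush();
        return bytes.toByteArray();
    }

    // Same layout, written through a big-endian ByteBuffer.
    static byte[] viaByteBuffer(short employeeId, long lastModifiedDate,
                                byte[] attributeValue) {
        ByteBuffer bb = ByteBuffer.allocate(2 + 8 + 4 + attributeValue.length)
                .order(ByteOrder.BIG_ENDIAN);
        bb.putShort(employeeId);
        bb.putLong(lastModifiedDate);
        bb.putInt(attributeValue.length);
        bb.put(attributeValue);
        return bb.array(); // the buffer is completely filled, so this is the whole record
    }

    public static void main(String[] args) throws IOException {
        byte[] attributeValue = "Byte Array Test".getBytes(StandardCharsets.UTF_8);
        byte[] a = viaDataOutputStream((short) 32767, 1379811105109L, attributeValue);
        byte[] b = viaByteBuffer((short) 32767, 1379811105109L, attributeValue);
        System.out.println(Arrays.equals(a, b)); // true
    }
}
```

If that holds for your data, the array from the question's DataOutputStream can be stored as is, and wrapping it in a ByteBuffer afterwards does not change its bytes.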

Answer 3

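First of all, I have never used Cassandra, so I will only answer the ByteBuffer part.

Before sending the bytes you should first put everything into a ByteBuffer; otherwise you have no control over the byte order of what gets stored, and that control is the whole point of using ByteBuffer.

To send the bytes, use: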

int size = 2 + 8 + 4 + attributeValue.length; // short is 2 bytes, long 8 and int 4

ByteBuffer bbuf = ByteBuffer.allocate(size); 
bbuf.order(ByteOrder.BIG_ENDIAN);

bbuf.putShort(employeeId);
bbuf.putLong(lastModifiedDate);
bbuf.putInt(attributeValue.length);
bbuf.put(attributeValue);

bbuf.rewind();

// Option 1 (not recommended): array() returns the ByteBuffer's internal array,
// so modifying the returned array directly modifies the buffer's contents.
// byte[] bytesToStore = bbuf.array();

// Option 2 (preferred): copy the contents out of the buffer
byte[] bytesToStore = new byte[size];
bbuf.get(bytesToStore);

Now you can store bytesToStore by sending it to Cassandra.
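The warning about array() in the code comments can be seen in a small sketch (class and method names are made up for illustration): a write through the array returned by array() shows up in the buffer, while a copy taken with get() stays independent.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of any library.
class BackingArrayDemo {
    // Returns the buffer's own backing array (no copy is made).
    static byte[] sharedView(ByteBuffer buf) {
        return buf.array();
    }

    // Returns an independent copy of the buffer's bytes from index 0 to its limit.
    static byte[] copyOf(ByteBuffer buf) {
        ByteBuffer dup = buf.duplicate(); // separate position/limit, same content
        dup.rewind();
        byte[] copy = new byte[dup.remaining()];
        dup.get(copy);
        return copy;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN);
        buf.putInt(0x01020304);

        byte[] copy = copyOf(buf);
        byte[] shared = sharedView(buf);
        shared[0] = 99; // writes straight into the buffer's storage

        System.out.println(buf.get(0)); // 99
        System.out.println(copy[0]);    // 1
    }
}
```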

To read them back:

byte[] allWrittenBytesTest = magicFunctionToAcquireDataFromCassandra();

ByteBuffer bb = ByteBuffer.wrap(allWrittenBytesTest);
bb.order(ByteOrder.BIG_ENDIAN);
bb.rewind();

int size = allWrittenBytesTest.length - 14;
short employeeId = bb.getShort();
long lastModifiedDate = bb.getLong();
int attributeValueLen = bb.getInt();
byte[] attributeValue = new byte[size];
bb.get(attributeValue); // read attributeValue from the remaining buffer

You do not even need to store the attributeValue length, because it can be recovered by subtracting 14 from allWrittenBytesTest.length (14 is the sum of the other field sizes [2 + 4 + 8]).
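If the length prefix is kept anyway, it can double as a sanity check when decoding. The following is a minimal round-trip sketch under that assumption; EmployeeRecordCodec, Decoded and HEADER_SIZE are names invented for this example and are not part of Cassandra or any driver.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical codec for the layout used in this thread:
// short employeeId | long lastModifiedDate | int length | attributeValue bytes
class EmployeeRecordCodec {
    static final int HEADER_SIZE = 2 + 8 + 4;

    // Simple value holder for the decoded fields.
    static final class Decoded {
        final short employeeId;
        final long lastModifiedDate;
        final byte[] attributeValue;

        Decoded(short employeeId, long lastModifiedDate, byte[] attributeValue) {
            this.employeeId = employeeId;
            this.lastModifiedDate = lastModifiedDate;
            this.attributeValue = attributeValue;
        }
    }

    static byte[] encode(short employeeId, long lastModifiedDate, byte[] attributeValue) {
        ByteBuffer bb = ByteBuffer.allocate(HEADER_SIZE + attributeValue.length)
                .order(ByteOrder.BIG_ENDIAN);
        bb.putShort(employeeId);
        bb.putLong(lastModifiedDate);
        bb.putInt(attributeValue.length);
        bb.put(attributeValue);
        return bb.array();
    }

    static Decoded decode(byte[] stored) {
        // Throws BufferUnderflowException if fewer than HEADER_SIZE bytes are present.
        ByteBuffer bb = ByteBuffer.wrap(stored).order(ByteOrder.BIG_ENDIAN);
        short employeeId = bb.getShort();
        long lastModifiedDate = bb.getLong();
        int length = bb.getInt();
        // Reject a length prefix that does not fit the bytes that are actually left.
        if (length < 0 || length > bb.remaining()) {
            throw new IllegalArgumentException("bad attributeValue length: " + length);
        }
        byte[] attributeValue = new byte[length];
        bb.get(attributeValue);
        return new Decoded(employeeId, lastModifiedDate, attributeValue);
    }

    public static void main(String[] args) {
        byte[] attributeValue = "Byte Array Test".getBytes(StandardCharsets.UTF_8);
        byte[] stored = encode((short) 32767, 1379811105109L, attributeValue);
        Decoded d = decode(stored);
        System.out.println(d.employeeId);       // 32767
        System.out.println(d.lastModifiedDate); // 1379811105109
        System.out.println(new String(d.attributeValue, StandardCharsets.UTF_8)); // Byte Array Test
    }
}
```

A reader written in another language would need to apply the same field order and byte order when unpacking the stored array.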

Edit: fixed some typos in the code.

How to serialize a byte array using ByteBuffer so that it follows Big Endian format?
