Question

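I'm working on a project in which I need to write a ByteArray to a file and then read that same file from a C++ program.

The ByteArray I write to the file is a concatenation of three ByteArrays:

The first 2 bytes are my schemaId, which I represent with the short data type.
The next 8 bytes are my Last Modified Date, which I represent with the long data type.
The remaining bytes can be of variable size; they hold the actual value of my attribute.

After writing the resulting ByteArray to a file, I now need to read that file from a C++ program, read the first line containing the ByteArray, and then split the resulting ByteArray as described above so that I can extract schemaId, Last Modified Date and my actual attribute value.

I have always done all my coding in Java, and I'm new to C++. I was able to write a C++ program that reads the file, but I don't know how to read the ByteArray in a way that lets me split it as described above.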

Below is my Java code, which writes the resulting ByteArray to a file. I now need to read the same file from C++.

import java.io.*;
import org.apache.commons.io.IOUtils; // Apache Commons IO

public static void main(String[] args) throws Exception {

    String os = "whatever os is";
    byte[] avroBinaryValue = os.getBytes();

    long lastModifiedDate = 1379811105109L;
    short schemaId = 32767;

    ByteArrayOutputStream byteOsTest = new ByteArrayOutputStream();
    DataOutputStream outTest = new DataOutputStream(byteOsTest);
    outTest.writeShort(schemaId);
    outTest.writeLong(lastModifiedDate);
    outTest.writeInt(avroBinaryValue.length);
    outTest.write(avroBinaryValue);

    byte[] allWrittenBytesTest = byteOsTest.toByteArray();

    DataInputStream inTest = new DataInputStream(new ByteArrayInputStream(allWrittenBytesTest));

    short schemaIdTest = inTest.readShort();

    long lastModifiedDateTest = inTest.readLong();

    int sizeAvroTest = inTest.readInt();
    byte[] avroBinaryValue1 = new byte[sizeAvroTest];
    // readFully keeps reading until the whole array has been filled
    inTest.readFully(avroBinaryValue1);

    System.out.println(schemaIdTest);
    System.out.println(lastModifiedDateTest);
    System.out.println(new String(avroBinaryValue1));

    writeFile(allWrittenBytesTest);
}

/**
 * Write the file in Java
 * @param byteArray the bytes to write
 */
public static void writeFile(byte[] byteArray) {

    File file = new File("bytearrayfile");

    // try-with-resources closes the stream even if the write fails
    try (FileOutputStream output = new FileOutputStream(file)) {
        IOUtils.write(byteArray, output);
    } catch (Exception ex) {
        ex.printStackTrace();
    }
}

Below is my C++ program, which reads the file above (written by Java). I don't know what I should do to split the ByteArrays so that I can read the individual ByteArrays.

#include "ReadFile.h"
#include <iostream>
#include <fstream>
#include <string>

using namespace std;

int main () {
    string line;

    std::ifstream myfile("bytearrayfile", std::ios::binary);

    //check to see if the file is opened:
    if (myfile.is_open())
    {
        //while there are still lines in the
        //file, keep reading:
        while (! myfile.eof() )
        {
            // I am not sure what I am supposed to do here?
        }

        //close the stream:
        myfile.close();
    }
    else cout << "Unable to open file";

    return 0;
}

After deserializing the individual ByteArrays, I should be able to extract schemaId as 32767, lastModifiedDate as 1379811105109 and my attribute value as whatever os is from the above C++ program.

I'm new to C++, so I'm running into a lot of problems. Any example based on my code would help me understand better.

Can anyone help me? Thanks.

Update:

With my latest code I am able to extract schemaId, lastModifiedDate and attributeLength, but I'm not sure how to extract the actual attribute value.

Answer 1

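In Java, your program does the following:

Write the schema ID
Write the last modified date
Write the Avro binary data length
Write the Avro binary data

So in C++ your program should:

Read the schema ID
Read the last modified date
Read the Avro binary data length
Read the Avro binary data

For this program the differences between C++ and Java are really quite small, so if you can do it in Java you should (with a little research) be able to do it in C++.

Here's a start (item 1):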

short schemaId;
myfile.read(reinterpret_cast<char*>(&schemaId), sizeof(short));

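The reinterpret_cast<char*> is necessary because the read function expects a char* as its first parameter, so you have to use a cast whenever the first argument isn't a pointer to char.

This assumes that sizeof(short) == 2 (always true in Java, usually true in C++) and that there are no endianness issues. It's hard to know in advance; you'll just have to try it and see.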
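
One way to "try it and see" is to look at the raw bytes before converting them. The sketch below is only an illustration, not part of the original answer: hexDump is a hypothetical helper that reads up to a given number of bytes from a stream and formats them as hex. Since Java's DataOutputStream.writeShort writes the high byte first, the schemaId 32767 (0x7FFF) from the question should appear as 7f ff at the start of the file.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: read up to `count` bytes from `in` and return them
// as lowercase hex pairs separated by spaces, e.g. "7f ff".
std::string hexDump(std::istream& in, std::size_t count) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < count; ++i) {
        const int c = in.get();  // yields EOF once the stream runs out
        if (c == std::char_traits<char>::eof()) break;
        if (!out.empty()) out += ' ';
        out += digits[(c >> 4) & 0x0F];  // high nibble
        out += digits[c & 0x0F];         // low nibble
    }
    return out;
}

// Example input: the two bytes Java writes for (short) 32767.
std::string demoSchemaIdBytes() {
    std::istringstream in(std::string("\x7f\xff", 2));
    return hexDump(in, 2);
}
```

With the file from the question, this could be called as hexDump(myfile, 2) right after opening the stream in binary mode. If the two bytes come out in that order but the naive read above gives a different number, the host is storing integers in the opposite byte order.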

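When reading or writing binary integers, Java and C++ implementations may use different byte orders. This is called endianness. If that is the case (Java's DataOutputStream always writes the high byte first, i.e. big-endian, while x86 machines are little-endian), you have to swap the byte order when reading integers. Here's some code that does that (it's pretty tedious stuff, and there may be neater ways).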

// needs <cstdint> for the integer types and <cstring> for std::memcpy
uint16_t schemaId;
uint64_t lastModifiedDate;
uint32_t attributeLength;
char buffer[8]; // sized for the biggest read we want to do

// read two bytes (in the wrong order on a little-endian host)
myfile.read(buffer, 2);
// swap the bytes
std::swap(buffer[0], buffer[1]);
// only now convert bytes to an integer
// (memcpy avoids the alignment and aliasing problems of casting the buffer pointer)
std::memcpy(&schemaId, buffer, sizeof schemaId);

// read eight bytes (again in the wrong order)
myfile.read(buffer, 8);
// swap the bytes
std::swap(buffer[0], buffer[7]);
std::swap(buffer[1], buffer[6]);
std::swap(buffer[2], buffer[5]);
std::swap(buffer[3], buffer[4]);
// only now convert bytes to an integer
std::memcpy(&lastModifiedDate, buffer, sizeof lastModifiedDate);

etc...
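
The remaining two steps follow the same pattern. Here is a minimal sketch, assuming the layout produced by the Java code in the question: a 4-byte length written by writeInt (high byte first), followed by that many bytes of attribute data. The name readLengthPrefixed is hypothetical, and the length is assembled with shifts rather than swaps so that the result does not depend on the host byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: read a 4-byte big-endian length, then that many
// bytes of payload, and return the payload as a std::string.
std::string readLengthPrefixed(std::istream& in) {
    unsigned char raw[4];
    if (!in.read(reinterpret_cast<char*>(raw), sizeof raw))
        throw std::runtime_error("truncated length field");

    // The most significant byte comes first in the file.
    const std::uint32_t length =
        (std::uint32_t(raw[0]) << 24) | (std::uint32_t(raw[1]) << 16) |
        (std::uint32_t(raw[2]) << 8)  |  std::uint32_t(raw[3]);

    // Note: a corrupt length would make this allocation very large.
    std::string value(length, '\0');
    if (length > 0 && !in.read(&value[0], length))
        throw std::runtime_error("truncated attribute value");
    return value;
}

// Example input: the bytes Java would write for the string "whatever os is".
std::string demoReadValue() {
    const std::string payload = "whatever os is";  // 14 bytes
    std::istringstream in(std::string("\x00\x00\x00\x0e", 4) + payload);
    return readLengthPrefixed(in);
}
```

In the program from the question this could be called as std::string value = readLengthPrefixed(myfile); once schemaId and lastModifiedDate have been read, because the stream position is then at the start of the length field.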

You need to #include <algorithm> to get std::swap (since C++11 it is declared in <utility>).
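
The unconditional swaps shown in this answer are only correct on a little-endian host. As a sketch of one way to make them conditional, the host byte order can be detected at run time and the bytes reversed only when needed. The names hostIsLittleEndian and fromBigEndian are hypothetical, and the code assumes the host is either purely little-endian or purely big-endian.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the least significant byte is stored first.
bool hostIsLittleEndian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Hypothetical helper: interpret sizeof(T) bytes that were written in
// big-endian order (as Java's DataOutputStream does) as a value of type T.
template <typename T>
T fromBigEndian(const unsigned char* bytes) {
    unsigned char tmp[sizeof(T)];
    std::memcpy(tmp, bytes, sizeof(T));
    if (hostIsLittleEndian())
        std::reverse(tmp, tmp + sizeof(T));  // swap only when needed
    T value;
    std::memcpy(&value, tmp, sizeof(T));     // avoids aliasing problems
    return value;
}
```

After myfile.read(buffer, 8), for example, the date could then be obtained with fromBigEndian<std::int64_t>(reinterpret_cast<const unsigned char*>(buffer)) instead of the four explicit swaps.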

How to deserialize ByteArrays from C++ by reading a file
