C++: reading back "wrong" values from a binary file?

Date: 2015-12-08 16:30:42

Tags: c++ file input output

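The project I'm working on uses a custom file format consisting of a header of several different variables, followed by the pixel data. My colleagues have developed a GUI in which processing, writing, reading and displaying this file format all work fine.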

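My problem is that, although I helped write the code that writes the data to disk, I can't read such a file back myself and get sensible values. I can read back the first variable (a char array), but none of the values that follow it.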

The file format follows this structure:

typedef struct {
    char hxtLabel[8];
    u64 hxtVersion;
    int motorPositions[9];
    int filePrefixLength;
    char filePrefix[100];
..
} HxtBuffer;

In the code, I create an object of the above struct and then set these sample values:

setLabel("MY_LABEL");
setFormatVersion(3);
setMotorPosition( 2109, 5438, 8767, 1234, 1022, 1033, 1044, 1055, 1066);
setFilePrefixLength(7);
setFilePrefix( string("prefix_"));
setDataTimeStamp( string("000000_000000"));

My code for opening the file:

// Open data file, binary mode, reading
ifstream datFile(aFileName.c_str(), ios::in | ios::binary);
if (!datFile.is_open()) {
    cout  << "readFile() ERROR: Failed to open file " << aFileName << endl;
    return false;
}

// How large is the file?
datFile.seekg(0, datFile.end);
int length =  datFile.tellg();
datFile.seekg(0, datFile.beg);

cout << "readFile() file " << setw(70) << aFileName << " is: " << setw(15) << length  << " long\n";

// Allocate memory for buffer:
char * buffer = new char[length];
// Read data as one block:
datFile.read(buffer, length);
datFile.close();

/// Looking at the start of the buffer, I should be seeing  "MY_LABEL"?

cout << "buffer: " << buffer  << " " << *(buffer)  << endl;

int* mSSX = reinterpret_cast<int*>(*(buffer+8));
int* mSSY = reinterpret_cast<int*>(&buffer+9);
int* mSSZ = reinterpret_cast<int*>(&buffer+10);
int* mSSROT = reinterpret_cast<int*>(&buffer+11);
int* mTimer = reinterpret_cast<int*>(&buffer+12);
int* mGALX = reinterpret_cast<int*>(&buffer+13);
int* mGALY = reinterpret_cast<int*>(&buffer+14);
int* mGALZ = reinterpret_cast<int*>(&buffer+15);
int* mGALROT = reinterpret_cast<int*>(&buffer+16);
int* filePrefixLength = reinterpret_cast<int*>(&buffer+17);

std::string filePrefix;   std::string dataTimeStamp;

// Read file prefix character by character into stringstream object
std::stringstream ss;
char* cPointer = (char *)(buffer+18);
int k;
for(k = 0; k < *filePrefixLength; k++)
{
    //read string
    char c;
    c = *cPointer;
    ss << c;
    cPointer++;
}
filePrefix = ss.str();

// Read timestamp character by character into stringstream object
std::stringstream timeStampStream;
/// Need not increment cPointer, already pointing @ 1st char of timeStamp
for (int l= 0; l < 13; l++)
{
    char c;
    c = * cPointer;
    timeStampStream << c;
}
dataTimeStamp = timeStampStream.str();

cout << 25 << endl;
cout << " mSSX:   "  << mSSX <<   "  mSSY:   "  << mSSY <<      "  mSSZ: "  << mSSZ;
cout << " mSSROT: "  << mSSROT << "  mTimer: "  << mTimer <<    "  mGALX: "  << mGALX;
cout << " mGALY:  "  << mGALY <<  "  mGALZ:  "  << mGALZ <<     "  mGALROT: "  << mGALROT;

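In the end, what I see is shown below. I added the 25 just to double-check that not everything is being printed in hexadecimal. As you can see, I do get the label "MY_LABEL" as expected. But the 9 motorPositions all look suspiciously like addresses rather than values.

The file prefix and the data timestamp (which should be strings, or at least characters) are wrong as well: both come out empty.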
buffer: MY_LABEL M
25
 mSSX:   0000000000000003  mSSY:   00000000001BF618  mSSZ: 00000000001BF620 mSSROT: 00000000001BF628  mTimer: 00000000001BF630  mGALX: 00000000001BF638 mGALY:  00000000001BF640  mGALZ:  00000000001BF648  mGALROT: 00000000001BF650filePrefix: dataTimeStamp: 

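I'm sure the solution isn't too complicated, but I've reached the stage where I'm just going in circles and can't make sense of things.

Thank you very much for reading this somewhat lengthy post.

- EDIT -

I may be reaching the maximum length allowed for a post, but just in case, I thought I would post the code that generates the data I'm trying to read back: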

bool writePixelOutput(string aOutputPixelFileName) {

    // Write pixel histograms out to binary file
    ofstream pixelFile;
    pixelFile.open(aOutputPixelFileName.c_str(), ios::binary | ios::out | ios::trunc);
    if (!pixelFile.is_open()) {
        LOG(gLogConfig, logERROR) << "Failed to open output file " << aOutputPixelFileName;
        return false;
    }

    // Write binary file header

    string label("MY_LABEL");
    pixelFile.write(label.c_str(), label.length());

    pixelFile.write((const char*)&mFormatVersion, sizeof(u64));

    // Include File Prefix/Motor Positions/Data Time Stamp - if format version > 1
    if (mFormatVersion > 1)
    {
        pixelFile.write((const char*)&mSSX, sizeof(mSSX));
        pixelFile.write((const char*)&mSSY, sizeof(mSSY));
        pixelFile.write((const char*)&mSSZ, sizeof(mSSZ));
        pixelFile.write((const char*)&mSSROT, sizeof(mSSROT));
        pixelFile.write((const char*)&mTimer, sizeof(mTimer));
        pixelFile.write((const char*)&mGALX, sizeof(mGALX));
        pixelFile.write((const char*)&mGALY, sizeof(mGALY));
        pixelFile.write((const char*)&mGALZ, sizeof(mGALZ));
        pixelFile.write((const char*)&mGALROT, sizeof(mGALROT));

        // Determine length of mFilePrefix string
        int filePrefixSize = (int)mFilePrefix.size();

        // Write prefix length, followed by prefix itself
        pixelFile.write((const char*)&filePrefixSize, sizeof(filePrefixSize));

        size_t prefixLen = 0;
        if (mFormatVersion == 2)    prefixLen = mFilePrefix.size();
        else                        prefixLen = 100;
        pixelFile.write(mFilePrefix.c_str(), prefixLen);

        pixelFile.write(mDataTimeStamp.c_str(), mDataTimeStamp.size());
    }
    // Continue writing header information that is common to both format versions
    pixelFile.write((const char*)&mRows, sizeof(mRows));
    pixelFile.write((const char*)&mCols, sizeof(mCols));
    pixelFile.write((const char*)&mHistoBins, sizeof(mHistoBins));

    // Write the actual data - taken out for brevity's sake
    // ..

    pixelFile.close();

    LOG(gLogConfig, logINFO) << "Written output histogram binary file " << aOutputPixelFileName;

    return true;
}

- EDIT 2 (9 Dec 2015, 11:32) -

Thanks for the help, I'm now closer to solving the problem. Following muelleth's answer, I tried:

/// Read into char buffer
char * buffer = new char[length];
datFile.read(buffer, length);// length determined by ifstream.seekg()

/// Let's try HxtBuffer
HxtBuffer *input = new HxtBuffer;
cout << "sizeof HxtBuffer:  " << sizeof *input << endl;
memcpy(input, buffer, length); 

I can then display the different struct members:

qDebug() << "Slice BUFFER label " << QString::fromStdString(input->hxtLabel);
qDebug() << "Slice BUFFER version " << QString::number(input->hxtVersion);
qDebug() << "Slice BUFFER hxtPrefixLength " << QString::number(input->filePrefixLength);
for (int i = 0; i < 9; i++)
{
    qDebug() << i << QString::number(input->motorPositions[i]);
}
qDebug() << "Slice BUFFER filePrefix " << QString::fromStdString(input->filePrefix);
qDebug() << "Slice BUFFER dataTimeStamp " << QString::fromStdString(input->dataTimeStamp);
qDebug() << "Slice BUFFER nRows " << QString::number(input->nRows);
qDebug() << "Slice BUFFER nCols " << QString::number(input->nCols);
qDebug() << "Slice BUFFER nBins " << QString::number(input->nBins);

The output is mostly as expected:

Slice BUFFER label  "MY_LABEL" 
Slice BUFFER version  "3" 
Slice BUFFER hxtPrefixLength  "2" 
0 "2109" 
1 "5438" 
...
7 "1055" 
8 "1066" 
Slice BUFFER filePrefix  "-1" 
Slice BUFFER dataTimeStamp  "000000_000000P" 
Slice BUFFER nRows  "20480" 
Slice BUFFER nCols  "256000" 
Slice BUFFER nBins  "0" 

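The exception is dataTimeStamp, which is 13 characters long but is displayed with 14. The three variables that follow, nRows, nCols and nBins, are then incorrect (they should be nRows = 80, nCols = 80, nBins = 1000). My guess is that the bits of the 14th character attributed to dataTimeStamp should really be read as part of nRows, and so on down the line to produce the correct nCols and nBins.

Using qDebug, I have separately verified (not shown here) that what I write to the file really is the values I expect, with their respective sizes.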
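One possible explanation for the shifted values can be sketched at the byte level. The full HxtBuffer definition is elided above, so this assumes it declares a 13-character dataTimeStamp immediately followed by int nRows, nCols and nBins. A typical compiler aligns int to 4 bytes and would insert 3 padding bytes after the 13 characters, whereas the writer emits the fields back to back. The helper names readLE32, appendLE32 and packedTail are hypothetical, and little-endian byte order is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode a 32-bit little-endian integer starting at bytes[offset].
// Little-endian is an assumption about the machine that wrote the file.
std::uint32_t readLE32(const std::vector<unsigned char>& bytes, std::size_t offset) {
    assert(offset + 4 <= bytes.size());
    return  static_cast<std::uint32_t>(bytes[offset])
         | (static_cast<std::uint32_t>(bytes[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(bytes[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(bytes[offset + 3]) << 24);
}

// Append a 32-bit value in little-endian order, as a packed writer would.
void appendLE32(std::vector<unsigned char>& bytes, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8)
        bytes.push_back(static_cast<unsigned char>((v >> shift) & 0xFF));
}

// Build the tail of the header the way the writer lays it out on disk:
// 13 timestamp characters immediately followed by nRows, nCols, nBins.
std::vector<unsigned char> packedTail() {
    const char stamp[] = "000000_000000";
    std::vector<unsigned char> bytes(stamp, stamp + 13);  // no terminator
    appendLE32(bytes, 80);    // nRows
    appendLE32(bytes, 80);    // nCols
    appendLE32(bytes, 1000);  // nBins
    return bytes;
}
```

Reading at offsets 13, 17 and 21 of packedTail() yields 80, 80 and 1000. Reading 3 bytes later, at offsets 16 and 20, yields 20480 and 256000, the values shown in the output above. Byte 13 holds 80, which is the character 'P' printed after the timestamp, because the char array has no terminator.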

2 answers:

Answer 0 (score: 1)

Personally, I would try reading exactly as many bytes from the file as the struct occupies, e.g.

int length = sizeof(HxtBuffer);

and then simply use memcpy to fill a local struct from the read buffer:

HxtBuffer input;
memcpy(&input, buffer, length);

You can then access your data, e.g.:

std::cout << "Data: " << input.hxtLabel << std::endl;
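Copying the bytes straight into the struct only works if the struct's in-memory layout matches the file byte for byte. An alternative is to read the fields one at a time, in the order the writer emits them. The sketch below is hedged: HxtHeader, readPod, readChars and readHeader are hypothetical names, u64 is assumed to be a 64-bit unsigned integer, and the 13-character timestamp length is taken from the question's sample data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>   // std::istringstream, for trying this without a file
#include <string>

// Hypothetical in-memory header; the names mirror the question, but this
// type is an illustration, not the project's HxtBuffer.
struct HxtHeader {
    std::string label;           // 8 bytes on disk, no terminator
    std::uint64_t version = 0;   // assumes u64 is 64 bits wide
    int motorPositions[9] = {};
    std::string filePrefix;      // 100 bytes on disk unless version == 2
    std::string dataTimeStamp;   // 13 bytes on disk (assumed)
    int rows = 0, cols = 0, bins = 0;
};

// Read sizeof(T) raw bytes into value; returns false on a short read.
template <typename T>
bool readPod(std::istream& in, T& value) {
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&value), sizeof value));
}

// Read exactly n characters into out; returns false on a short read.
bool readChars(std::istream& in, std::size_t n, std::string& out) {
    out.assign(n, '\0');
    return n == 0 || static_cast<bool>(in.read(&out[0], static_cast<std::streamsize>(n)));
}

// Mirror the order of the write calls in writePixelOutput().
bool readHeader(std::istream& in, HxtHeader& h) {
    if (!readChars(in, 8, h.label) || !readPod(in, h.version)) return false;
    if (h.version > 1) {
        for (int& m : h.motorPositions)
            if (!readPod(in, m)) return false;
        int prefixLength = 0;
        if (!readPod(in, prefixLength) || prefixLength < 0) return false;
        const std::size_t used = static_cast<std::size_t>(prefixLength);
        const std::size_t fieldLength = (h.version == 2) ? used : std::size_t(100);
        if (used > fieldLength) return false;
        if (!readChars(in, fieldLength, h.filePrefix)) return false;
        assert(h.filePrefix.size() == fieldLength);
        h.filePrefix.resize(used);  // drop the unused tail of the field
        if (!readChars(in, 13, h.dataTimeStamp)) return false;
    }
    return readPod(in, h.rows) && readPod(in, h.cols) && readPod(in, h.bins);
}
```

With a std::ifstream opened in binary mode, readHeader(datFile, header) leaves the stream positioned at the start of the pixel data. The same function accepts a std::istringstream, which is convenient for testing against hand-built bytes.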

答案 1 :(得分:1)

Why read into a buffer instead of reading straight into the struct?

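HxtBuffer data;
datFile.read(reinterpret_cast<char *>(&data), sizeof data);
// io_exception is a placeholder for whatever exception type you use.
if (!datFile || datFile.gcount() != static_cast<std::streamsize>(sizeof data))
    throw io_exception();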

// Can use data.

If you do want to read into a char buffer, then the way you extract the data is wrong. You probably want to do something like this.

char *buf_offset=buffer+8+sizeof(u64);  // Skip label (8 chars) and version (int64)
int mSSX = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSY = *reinterpret_cast<int*>(buf_offset);
buf_offset+=sizeof(int);
int mSSZ = *reinterpret_cast<int*>(buf_offset);
/* etc. */

Or, slightly better (assuming you don't modify the buffer's contents).

int *ptr_motors=reinterpret_cast<int *>(buffer+8+sizeof(u64));
int &mSSX = ptr_motors[0];
int &mSSY = ptr_motors[1];
int &mSSZ = ptr_motors[2];
/* etc. */

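Note that I did not declare mSSX, mSSY, etc. as pointers. Your code prints them as addresses because you told the compiler that they are addresses (pointers).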
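A variation on the same idea copies each value out of the buffer with memcpy, which yields a plain int (so it prints as a value) and avoids dereferencing an int* that may not be suitably aligned. This is a minimal sketch: readAt and motorOffset are hypothetical helpers, and the offsets assume the layout from the question, an 8-byte label followed by an 8-byte version and then nine ints.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy a T out of a raw byte buffer at the given byte offset.
// memcpy avoids dereferencing a possibly misaligned pointer.
template <typename T>
T readAt(const char* buffer, std::size_t bufferSize, std::size_t offset) {
    assert(offset + sizeof(T) <= bufferSize);
    T value;
    std::memcpy(&value, buffer + offset, sizeof(T));
    return value;
}

// Byte offset of motor position i, assuming an 8-byte label,
// then an 8-byte version, then nine consecutive ints.
std::size_t motorOffset(std::size_t i) {
    return 8 + 8 + i * sizeof(int);
}
```

For example, int mSSX = readAt<int>(buffer, length, motorOffset(0)); gives a value that can be streamed to cout directly, with length being the number of bytes read into buffer.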