C++ impossible pointer behaviour

Asked: 2014-01-26 10:53:45

Tags: c++ pointers memory-management

I am writing something like Java and I have a problem with my pointers. I have a struct:

struct _lnHeader32
{ unsigned char signature[2]; //LN
  unsigned char architecture;
  unsigned int length; //Without _lnHeader
  unsigned int lnHeaderLength;
  unsigned char permissions;
  unsigned char typeOfExecutable;
  unsigned int flowSegment;
  unsigned int dataSegment;
  unsigned int loaderSegment;
  unsigned int cleanerSegment;
  unsigned int errorSegment;
  unsigned int exportTable;
  unsigned int importTable;
  unsigned int authenticationTable; //Encrypt it with GPG.
  unsigned int loaderTable;
};

I load the executable, which is little-endian, using std::fstream:

lnFile.open(argv[1], std::fstream::in | std::fstream::binary);
if (false == lnFile.is_open())
 throw (unableToOpen);
lnSize = getFileSize(lnFile);
lnImage = new (std::nothrow) unsigned char [lnSize];
if (0 == lnImage)
 throw (noMem);
lnFile.read(reinterpret_cast<char*>(lnImage), lnSize); //#1 Possible mistake?
if (!lnFile)
 throw (unableToRead);
lnFile.close();

Then I point a _lnHeader32* at the allocated lnImage:

lnHeader32 = reinterpret_cast<_lnHeader32*>(lnImage);

Finally, I print the address of each field of the struct using two methods:

//Method 1
std::cout << reinterpret_cast<unsigned int*>(lnImage) << "\n";
std::cout << reinterpret_cast<unsigned int*>(lnImage+2) << "\n";
std::cout << reinterpret_cast<unsigned int*>(lnImage+3) << "\n";
std::cout << reinterpret_cast<unsigned int*>(lnImage+7) << "\n";
std::cout << reinterpret_cast<unsigned int*>(lnImage+11) << "\n";
std::cout << reinterpret_cast<unsigned int*>(lnImage+12) << "\n\n";

//Method 2    
std::cout << reinterpret_cast<unsigned int*>(&lnHeader32->signature) << "\n";
std::cout << reinterpret_cast<unsigned int*>(&lnHeader32->architecture) << "\n";
std::cout << reinterpret_cast<unsigned int*>(&lnHeader32->length) << "\n";
std::cout << reinterpret_cast<unsigned int*>(&lnHeader32->lnHeaderLength) << "\n";
std::cout << reinterpret_cast<unsigned int*>(&lnHeader32->permissions) << "\n";
std::cout << reinterpret_cast<unsigned int*>(&lnHeader32->typeOfExecutable) 
                                                                  << "\n\n";

This gives me the following output:

0xe8b260
0xe8b262
0xe8b263 <---
0xe8b267
0xe8b26b
0xe8b26c

0xe8b260
0xe8b262
0xe8b264 <--- Should be 0xe8b263 | Here starts problem
0xe8b268
0xe8b26c
0xe8b26d

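The first method gives the right locations for the fields of lnHeader32, but I would prefer to use the second method. I have checked my calculations several times. Why does this happen? The executable is generated by a compiler written in Perl; could that have any effect?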

2 answers:

Answer 0 (score: 3)

Because of padding (see https://en.wikipedia.org/wiki/Data_padding), your struct actually looks like this:

struct _lnHeader32 {
  unsigned char signature[2]; //LN
  unsigned char architecture;
  unsigned char padding0[1]; // inserted by the compiler so the next member is 4-byte aligned
  unsigned int length; //Without _lnHeader
  unsigned int lnHeaderLength;
  unsigned char permissions;
  unsigned char typeOfExecutable;
  unsigned char padding1[2]; // inserted by the compiler so the next member is 4-byte aligned
  unsigned int flowSegment;
  unsigned int dataSegment;
  unsigned int loaderSegment;
  unsigned int cleanerSegment;
  unsigned int errorSegment;
  unsigned int exportTable;
  unsigned int importTable;
  unsigned int authenticationTable; //Encrypt it with GPG.
  unsigned int loaderTable;
};
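
One way to check this layout on your own compiler is to print the member offsets with offsetof. The following is only a sketch: HeaderLayoutDemo and printHeaderOffsets are hypothetical names (a renamed copy of the struct from the question), and the values mentioned in the comments assume a platform where unsigned int is 4 bytes and 4-byte aligned.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t
#include <iostream>

// Hypothetical copy of the question's header, renamed for this demo.
struct HeaderLayoutDemo {
    unsigned char signature[2];
    unsigned char architecture;
    unsigned int  length;
    unsigned int  lnHeaderLength;
    unsigned char permissions;
    unsigned char typeOfExecutable;
    unsigned int  flowSegment;
    unsigned int  dataSegment;
    unsigned int  loaderSegment;
    unsigned int  cleanerSegment;
    unsigned int  errorSegment;
    unsigned int  exportTable;
    unsigned int  importTable;
    unsigned int  authenticationTable;
    unsigned int  loaderTable;
};

// Prints where the compiler actually placed the first few members.
void printHeaderOffsets() {
    // Every member must start at a multiple of its alignment, which is
    // why `length` cannot sit at offset 3 when unsigned int is 4-aligned.
    assert(offsetof(HeaderLayoutDemo, length) % alignof(unsigned int) == 0);

    std::cout << "architecture:     " << offsetof(HeaderLayoutDemo, architecture) << "\n"
              << "length:           " << offsetof(HeaderLayoutDemo, length) << "\n"
              << "lnHeaderLength:   " << offsetof(HeaderLayoutDemo, lnHeaderLength) << "\n"
              << "permissions:      " << offsetof(HeaderLayoutDemo, permissions) << "\n"
              << "typeOfExecutable: " << offsetof(HeaderLayoutDemo, typeOfExecutable) << "\n"
              << "flowSegment:      " << offsetof(HeaderLayoutDemo, flowSegment) << "\n"
              << "sizeof(struct):   " << sizeof(HeaderLayoutDemo) << "\n";
}
```

Calling printHeaderOffsets() from main should show length at a different offset than the 3 that the file format expects, which matches the addresses printed by Method 2 in the question.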

Answer 1 (score: 1)

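The fields of a C++ type are not necessarily contiguous in memory; there are rules that govern when the compiler introduces padding between fields.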

Fields of different types are usually aligned on predetermined boundaries. In this case char has a size of 1 byte and is 1-aligned, while int has a size of 4 bytes and is 4-aligned.
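
If you want to see the size and alignment your own compiler uses, sizeof and alignof report them directly. This is a small sketch; isAlignedTo and printAlignment are hypothetical helper names, and the exact numbers for unsigned int depend on the platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>

// Returns true when the address p is a multiple of `alignment`.
bool isAlignedTo(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Prints the size and alignment requirement of the two types involved.
void printAlignment() {
    std::cout << "char:         size " << sizeof(char)
              << ", alignment " << alignof(char) << "\n";
    std::cout << "unsigned int: size " << sizeof(unsigned int)
              << ", alignment " << alignof(unsigned int) << "\n";

    // An ordinary unsigned int object always satisfies its own alignment.
    unsigned int value = 0;
    assert(isAlignedTo(&value, alignof(unsigned int)));
}
```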

So your struct looks like this in memory:

0: signature[0]
1: signature[1]
2: architecture
3: PADDING!
4: first byte of length
...

Because of the padding, the value you get for the length field is incorrect.

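I suggest that you do not read raw data into memory and reinterpret it as some type. This can be quite dangerous, because you can rarely be sure how the compiler lays out your type in memory.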

A safer solution is to create a helper function such as

_lnHeader32 readLnHeader32(const char* binary);

and, inside this function, read the fields of the _lnHeader32 struct one by one from the binary stream that was read from the file.
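
A minimal sketch of such a helper might look like the following. Several things here are assumptions rather than part of the question: the file is assumed to store the fields back to back with no padding (49 bytes in total) and in little-endian order, the helper takes unsigned char plus an explicit size instead of a bare const char*, and the names LnHeader32 and ByteReader are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical in-memory form of the header. Its layout no longer
// matters, because no pointer into the file image is cast to it.
struct LnHeader32 {
    std::uint8_t  signature[2];
    std::uint8_t  architecture;
    std::uint32_t length;
    std::uint32_t lnHeaderLength;
    std::uint8_t  permissions;
    std::uint8_t  typeOfExecutable;
    std::uint32_t flowSegment, dataSegment, loaderSegment;
    std::uint32_t cleanerSegment, errorSegment;
    std::uint32_t exportTable, importTable;
    std::uint32_t authenticationTable, loaderTable;
};

// Hypothetical cursor over a byte buffer with bounds checking.
class ByteReader {
public:
    ByteReader(const unsigned char* data, std::size_t size)
        : data_(data), size_(size), pos_(0) {}

    std::uint8_t u8() {
        require(1);
        return data_[pos_++];
    }

    // Assembles a 32-bit value from four little-endian bytes,
    // independent of the host's own byte order.
    std::uint32_t u32le() {
        require(4);
        std::uint32_t v = static_cast<std::uint32_t>(data_[pos_])
                        | static_cast<std::uint32_t>(data_[pos_ + 1]) << 8
                        | static_cast<std::uint32_t>(data_[pos_ + 2]) << 16
                        | static_cast<std::uint32_t>(data_[pos_ + 3]) << 24;
        pos_ += 4;
        return v;
    }

private:
    void require(std::size_t n) const {
        if (size_ - pos_ < n) throw std::out_of_range("truncated header");
    }
    const unsigned char* data_;
    std::size_t size_;
    std::size_t pos_;
};

// Reads the header field by field, in the order the file stores them.
LnHeader32 readLnHeader32(const unsigned char* binary, std::size_t size) {
    assert(binary != nullptr);
    ByteReader in(binary, size);
    LnHeader32 h{};
    h.signature[0]        = in.u8();
    h.signature[1]        = in.u8();
    h.architecture        = in.u8();
    h.length              = in.u32le();
    h.lnHeaderLength      = in.u32le();
    h.permissions         = in.u8();
    h.typeOfExecutable    = in.u8();
    h.flowSegment         = in.u32le();
    h.dataSegment         = in.u32le();
    h.loaderSegment       = in.u32le();
    h.cleanerSegment      = in.u32le();
    h.errorSegment        = in.u32le();
    h.exportTable         = in.u32le();
    h.importTable         = in.u32le();
    h.authenticationTable = in.u32le();
    h.loaderTable         = in.u32le();
    return h;
}
```

With the buffer from the question this would be called as readLnHeader32(lnImage, lnSize). The result is the same no matter how the compiler pads the struct, and a file that is too short produces an exception instead of a read past the end of the buffer.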