Question

I have a file containing a large number of hex strings. Here are the first few lines:

0000038f
0000111d
0000111d
03030303
//Goes on for a long time

I have a large struct meant to hold this data:

typedef struct
{
  unsigned int field1: 5;
  unsigned int field2: 11;
  unsigned int field3: 16;
  //Goes on for a long time
}calibration;

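What I want to do is read the strings above and store them in the struct. I can assume the input is valid (it has already been validated before I get it).

I already have a loop that reads the file and puts the whole thing into one string: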

std::string line = "";
std::string hexText = "";
while(std::getline(readFile, line))
{
  hexText += line;
}
//Convert string into calibration
//Convert string into long int
long int hexInt = strtol(hexText.c_str(), NULL, 16);
//Here I get stuck: How to get from long int to calibration...?

Answer 1

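How to get from long int to calibration...?

Cameron's answer is good, and is probably what you want.

I offer here another (perhaps not so different) approach.

Note 1: Your file input needs rework. I would suggest

a) use getline() to fetch one line at a time into a string

b) convert that one entry to a uint32_t (I would use stringstream rather than atol)

Once you understand how to detect invalid input and recover from it, you can combine a) and b) into a single step

c) then install the uint32_t into your struct; the code offered below may provide some insight.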
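Steps a) and b) could be sketched roughly as follows. This is only an illustration, not code from the question: the names parseHexWord and readHexWords are made up, and skipping invalid lines is just one possible recovery policy.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse one line of hex text (e.g. "0000038f") into a
// 32-bit word. Returns false if the line is not a single hex number that
// fits in 32 bits.
bool parseHexWord(const std::string& line, std::uint32_t& out)
{
    std::istringstream iss(line);
    unsigned long long value = 0;
    if (!(iss >> std::hex >> value)) return false;  // no hex number found
    char extra;
    if (iss >> extra) return false;                 // trailing non-blank text
    if (value > 0xFFFFFFFFull) return false;        // does not fit in 32 bits
    out = static_cast<std::uint32_t>(value);
    return true;
}

// Hypothetical helper: step a) fetches one line at a time with getline(),
// step b) converts it. Lines that fail to parse are skipped here; a real
// program might prefer to report them.
std::vector<std::uint32_t> readHexWords(std::istream& in)
{
    std::vector<std::uint32_t> words;
    std::string line;
    while (std::getline(in, line)) {
        std::uint32_t word = 0;
        if (parseHexWord(line, word)) words.push_back(word);
    }
    return words;
}
```

Any std::istream works as the source, so the same function can be fed a std::ifstream for the real file or a std::istringstream when testing.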

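Note 2: I worked with bit fields for many years and have developed a distaste for them. I have never found them more convenient than the alternatives.

The alternative I prefer is bit masks and field shifting.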
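To show what masks and shifts might look like for the first 32-bit word, here is a minimal sketch. It assumes field1 sits in the low 5 bits, field2 in the next 11 and field3 in the top 16; that layout is an assumption about the compiler, and all the names (kField1Mask, getField, setField, ...) are invented for illustration.

```cpp
#include <cstdint>

// Assumed layout of the first word: bits 0-4 field1, bits 5-15 field2,
// bits 16-31 field3. Adjust the shifts if your data is laid out differently.
constexpr unsigned      kField1Shift = 0;
constexpr unsigned      kField2Shift = 5;
constexpr unsigned      kField3Shift = 16;
constexpr std::uint32_t kField1Mask  = 0x1Fu;    //  5 bits
constexpr std::uint32_t kField2Mask  = 0x7FFu;   // 11 bits
constexpr std::uint32_t kField3Mask  = 0xFFFFu;  // 16 bits

// Extract a field: move it down to bit 0, then drop the other bits.
constexpr std::uint32_t getField(std::uint32_t word, std::uint32_t mask,
                                 unsigned shift)
{
    return (word >> shift) & mask;
}

// Replace a field: clear its bits in the word, then OR in the new value.
constexpr std::uint32_t setField(std::uint32_t word, std::uint32_t mask,
                                 unsigned shift, std::uint32_t value)
{
    return (word & ~(mask << shift)) | ((value & mask) << shift);
}
```

With this approach the layout is spelled out in the constants rather than left to the compiler, so it does not depend on how bit fields happen to be packed.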

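As far as we can tell from your problem statement, your problem does not seem to need bit fields (Cameron's answer illustrates this).

Note 3: Not all compilers will pack these bit fields for you.

The last compiler I used needed what is called a "pragma".

G++ 4.8 on Ubuntu seems to pack the bytes just fine (i.e. no pragma needed).

sizeof(calibration) for the original code is 4 ... i.e. packed.

Another issue is that packing can change unexpectedly when you change options, upgrade the compiler, or switch compilers.

My team's workaround was to always assert the struct size and a few byte offsets in the ctor.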
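A minimal sketch of that kind of guard might look like the following. It is not my team's actual code: the struct name Calibration, the raw() helper and the expected value 0x20 are assumptions for illustration, and the run-time check assumes the layout G++ produces on a little-endian machine. Since offsetof cannot be applied to bit fields, the sketch checks the raw word instead of byte offsets.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the first 32 bits of the question's struct.
struct Calibration
{
    std::uint32_t field1 :  5;
    std::uint32_t field2 : 11;
    std::uint32_t field3 : 16;

    Calibration() : field1(0), field2(0), field3(0)
    {
        // Compile-time guard: the three fields must share one 32-bit word.
        static_assert(sizeof(Calibration) == 4,
                      "Calibration is not packed into 32 bits");

        // Run-time guard (assumed layout): field2 starts at bit 5, so
        // setting it to 1 should make the raw word 0x20.
        field2 = 1;
        assert(raw() == 0x20u && "unexpected bit-field layout");
        field2 = 0;
    }

    // Copy the object's bytes into an integer so they can be inspected.
    std::uint32_t raw() const
    {
        std::uint32_t word = 0;
        std::memcpy(&word, this, sizeof word);
        return word;
    }
};
```

If a compiler option or upgrade changes the packing, the static_assert stops the build and the assert fires the first time an object is constructed.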

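Note 4: I did not illustrate using a 'union' to overlay a uint32_t array on the calibration struct.

That might be preferred over the reinterpret_cast approach. Check your requirements, team lead, professor.

Anyway, in the spirit of your original effort, consider the following additions to your struct calibration: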

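  #include <cstdint>    // std::uint32_t
  #include <iomanip>    // std::setw, std::setfill
  #include <iostream>   // std::cout

  using std::uint32_t;

  typedef struct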
  {
     uint32_t field1 :  5;
     uint32_t field2 : 11;
     uint32_t field3 : 16;
     //Goes on for a long time

     // I made up these next 2 fields for illustration
     uint32_t field4 :  8;
     uint32_t field5 : 24;

     // ... add more fields here

     // something typically done by ctor or used by ctor
     void clear() { field1 = 0; field2 = 0; field3 = 0; field4 = 0; field5 = 0; }

     void show123(const char* lbl=0) {
        if(0 == lbl) lbl = " ";
        std::cout << std::setw(16) << lbl;
        std::cout << "     " << std::setw(5) << std::hex << field3 << std::dec 
                  << "     " << std::setw(5) << std::hex << field2 << std::dec 
                  << "     " << std::setw(5) << std::hex << field1 << std::dec 
                  << "     0x" << std::hex << std::setfill('0') << std::setw(8) 
                  << *(reinterpret_cast<uint32_t*>(this))
                  << "    => "  << std::dec << std::setfill(' ') 
                  << *(reinterpret_cast<uint32_t*>(this))
                  << std::endl;
     } // show
     // I did not create show456() ... 

     // 1st uint32_t: set new val, return previous
     uint32_t set123(uint32_t nxtVal) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        uint32_t prevVal = myVal[0];
        myVal[0] = nxtVal;
        return (prevVal);
     }

     // return current value of the combined field1, field2 field3
     uint32_t get123(void) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        return  (myVal[0]);
     }

     // 2nd uint32_t: set new val, return previous
     uint32_t set45(uint32_t nxtVal) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        uint32_t prevVal = myVal[1];
        myVal[1] = nxtVal;
        return (prevVal);
     }

     // return current value of the combined field4, field5
     uint32_t get45(void) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        return  (myVal[1]);
     }


     // guess that next 4 fields fill 32 bits
     uint32_t get6789(void) {
        uint32_t* myVal = reinterpret_cast<uint32_t*>(this);
        return  (myVal[2]);
     }
     // ... tedious expansion

  } calibration;

Here is some test code to illustrate usage:

  uint32_t t125()
  {
     const char* lbl = 
        "\n                    16 bits   11 bits    5 bits      hex         => dec";

     calibration cal;
     cal.clear();
     std::cout << lbl << std::endl;
     cal.show123();

     cal.field1 = 1;
     cal.show123("field1 =     1");
     cal.clear();
     cal.field1 = 31;
     cal.show123("field1 =    31");
     cal.clear();

     cal.field2 = 1;
     cal.show123("field2 =     1");
     cal.clear();
     cal.field2 = (2047 & 0x07ff);
     cal.show123("field2 =  2047");
     cal.clear();

     cal.field3 = 1;
     cal.show123("field3 =     1");
     cal.clear();
     cal.field3 = (65535 & 0x0ffff);
     cal.show123("field3 = 65535");

     cal.set123 (0xABCD6E17);
     cal.show123 ("set123(0x...)");

     cal.set123 (0xffffffff);
     cal.show123 ("set123(0x...)");

     cal.set123 (0x0);
     cal.show123 ("set123(0x...)");

     std::cout << "\n";

     cal.clear();
     std::cout << "get123(): " << cal.get123() << std::endl;
     std::cout << " get45(): " << cal.get45() << std::endl;

     // values from your file:
     cal.set123 (0x0000038f);
     cal.set45  (0x0000111d);

     std::cout << "get123(): " << "0x"  << std::hex << std::setfill('0') 
               << std::setw(8) << cal.get123() << std::endl;
     std::cout << " get45(): " << "0x"  << std::hex << std::setfill('0') 
               << std::setw(8) <<  cal.get45() << std::endl;

     // cal.set6789 (0x03030303);
     // std::cout << "get6789(): " << cal.get6789() << std::endl;

     // ...

     return(0);
  }

Test code output:

                    16 bits   11 bits    5 bits      hex         => dec
                         0         0         0     0x00000000    => 0
  field1 =     1         0         0         1     0x00000001    => 1
  field1 =    31         0         0        1f     0x0000001f    => 31
  field2 =     1         0         1         0     0x00000020    => 32
  field2 =  2047         0       7ff         0     0x0000ffe0    => 65,504
  field3 =     1         1         0         0     0x00010000    => 65,536
  field3 = 65535      ffff         0         0     0xffff0000    => 4,294,901,760
   set123(0x...)      abcd       370        17     0xabcd6e17    => 2,882,366,999
   set123(0x...)      ffff       7ff        1f     0xffffffff    => 4,294,967,295
   set123(0x...)         0         0         0     0x00000000    => 0

get123(): 0
 get45(): 0
get123(): 0x0000038f
 get45(): 0x0000111d

The goal of this code is to help you see how the bit fields map from the msbyte to the lsbyte of the data.

Answer 2

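If you care about efficiency at all, don't read the whole thing into a string and then convert it. Simply read one word at a time and convert it. Your loop should look something like: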

calibration c;
std::uint32_t* dest = reinterpret_cast<std::uint32_t*>(&c);
// One past the last 32-bit word of the structure
std::uint32_t* const end = dest + sizeof(calibration) / sizeof(std::uint32_t);
while (true) {
    char hexText[8];
    // TODO: Attempt to read 8 bytes from file and then skip whitespace
    // TODO: Break out of the loop on EOF

    std::uint32_t hexValue = 0;    // TODO: Convert hex to dword

    // Assumes the structure padding & packing matches the dump version's
    // Assumes the structure size is exactly a multiple of 32 bits (w/ padding)
    static_assert(sizeof(calibration) % 4 == 0,
                  "calibration size must be a multiple of 4 bytes");
    assert(dest < end && "Too much data");
    *dest++ = hexValue;
}
assert(dest == end && "Too little data");

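Converting 8 hex characters into an actual 4-byte int is a good exercise and is well covered elsewhere, so I've left it out (along with the file reading, which is similarly well covered).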
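For readers who want a starting point for that exercise, one possible hand-rolled conversion is sketched below. The names hexDigitValue and hexToWord are made up, and the sketch assumes the first character is the most significant nibble, as in ordinary printed hex.

```cpp
#include <cstdint>

// Hypothetical helper: value of one hex digit, or -1 if it is not a digit.
int hexDigitValue(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

// Hypothetical helper: convert exactly 8 hex characters (no terminator
// required) into a 32-bit value. Returns false on any non-hex character.
bool hexToWord(const char* text, std::uint32_t& out)
{
    std::uint32_t value = 0;
    for (int i = 0; i < 8; ++i) {
        const int digit = hexDigitValue(text[i]);
        if (digit < 0) return false;
        // Make room for the next nibble, then append it.
        value = (value << 4) | static_cast<std::uint32_t>(digit);
    }
    out = value;
    return true;
}
```

Because it reads exactly 8 characters, it can be called directly on a char hexText[8] buffer like the one in the loop above.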

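Note the two assumptions in the loop: the first cannot be checked at run time or at compile time, and must either be agreed upon in advance or extra work must be done to serialize the struct properly (handling struct packing, padding, etc.). The last one can at least be checked at compile time with static_assert.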

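Also, when converting the hex string, care must be taken to ensure that the endianness of the hex bytes in the file matches the endianness of the architecture executing the program. This will depend on whether the hex was written in a specific endianness in the first place (in which case you can convert it from the known endianness to the current architecture's endianness fairly easily), or whether it is architecture-dependent (in which case you have no choice but to assume the endianness is the same as on your current architecture).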
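If the dump is known to list the bytes of each word in a fixed order, one portable way to handle it is to assemble the value with shifts, which gives the same number on any host. The sketch below assumes the 8 hex characters have already been turned into 4 bytes in file order; ByteOrder and wordFromBytes are hypothetical names.

```cpp
#include <cstdint>

// Byte order in which the dump lists the 4 bytes of each word (assumed to
// be known in advance).
enum class ByteOrder { Big, Little };

// Hypothetical helper: combine 4 bytes, given in file order, into a 32-bit
// value. Shifts operate on values, not on memory, so the result does not
// depend on the endianness of the machine running the program.
std::uint32_t wordFromBytes(const unsigned char* bytes, ByteOrder order)
{
    const std::uint32_t b0 = bytes[0];
    const std::uint32_t b1 = bytes[1];
    const std::uint32_t b2 = bytes[2];
    const std::uint32_t b3 = bytes[3];
    if (order == ByteOrder::Big) {
        // First byte in the file is the most significant one.
        return (b0 << 24) | (b1 << 16) | (b2 << 8) | b3;
    }
    // First byte in the file is the least significant one.
    return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
}
```

For the line 0000038f this yields 0x0000038f when the dump is big-endian and 0x8f030000 when it is little-endian, so the two cases are easy to tell apart on known data.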

Converting a hex string to a struct
