Question

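In my project I wrote a function that decompresses a QString. The string was compressed with a very basic format that I implemented in a separate function. After some testing, I found that this function causes a massive slowdown, because it runs on huge QStrings and is called more than 2900 times.

I have been trying to change this function so that it runs faster. I have dabbled with QStringRef without good results (I might be doing it wrong). QByteArrays and QByteRefs are hard to work with and to check values in (imo).

I really need some help optimizing this function so that it runs FAST! As fast as possible! I believe the constant calls to .mid are slowing things down, but I just don't know any other way to read/write the bytes.

EDIT: A better question: is there a common practice I'm missing when it comes to decompression functions? I use zlib later in this same program, and it compresses faster than the simple function I wrote below. Why is that? What does zlib do differently?

Thanks in advance for your time. :)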

Here is what a very small compressed QString looks like:

//Compressed
//This QString is just a hexadecimal representation of a QByteArray
//
QString com("010203ff0504ff0a05ff00ff01ff02ff0306");

And here is the same QString after decompression:

//Decompressed
QString decom("0102030404040404040505050505050505050505ffffffffffff06060606");

Sorry if you don't understand the format... it isn't that important. Maybe this will help:

-a byte with "ff" tells us we're about to decompress
-the byte after "ff" is the number of times to repeat the NEXT byte + 1
-UNLESS that number is 0, 1, or 2, then "ff" is the value to be repeated

Examples:
-"010203" decompressed is "010203"

-"ff0401" decompressed is "0101010101"

-"ff02" decompressed is "ffffff"
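
The rules above can be sketched in standard C++ as a single pass over the hex text that copies two characters at a time into a pre-reserved output, with no temporary substring per byte. This is only a sketch under assumptions: the names hex_digit and expand_hex_runs are made up for illustration, the input is assumed to use lowercase "ff" as the marker (as in the examples), and throwing std::invalid_argument on truncated input is one possible error policy rather than anything from the original code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if ch is not a hex digit.
int hex_digit(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

// Hypothetical name: expands the run-length format described above,
// working directly on hex text (two characters per byte).
std::string expand_hex_runs(const std::string& com)
{
    if (com.size() % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");

    std::string decom;
    decom.reserve(com.size()); // lower bound for typical input; grows if runs are long

    std::size_t pos = 0;
    while (pos < com.size())
    {
        // Ordinary byte: copy its two hex characters and move on.
        if (com.compare(pos, 2, "ff") != 0)
        {
            decom.append(com, pos, 2);
            pos += 2;
            continue;
        }

        // Marker byte: the next byte is the repeat count.
        if (pos + 4 > com.size())
            throw std::invalid_argument("missing count byte");
        const int hi = hex_digit(com[pos + 2]);
        const int lo = hex_digit(com[pos + 3]);
        if (hi < 0 || lo < 0)
            throw std::invalid_argument("count is not hexadecimal");
        const int count = hi * 16 + lo;

        std::size_t value_pos = pos;  // counts 0..2: the "ff" itself is the value
        std::size_t next = pos + 4;
        if (count > 2)
        {
            if (pos + 6 > com.size())
                throw std::invalid_argument("missing value byte");
            value_pos = pos + 4;      // otherwise the value follows the count
            next = pos + 6;
        }

        // The format repeats the value count + 1 times.
        for (int i = 0; i <= count; ++i)
            decom.append(com, value_pos, 2);
        pos = next;
    }
    return decom;
}
```

With this sketch, expand_hex_runs("ff0401") gives "0101010101" and expand_hex_runs("ff02") gives "ffffff", matching the examples above.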

Here is the decompression function I wrote:

int HexToIntS(QString num_hex)  //converts the byte to a number
{
    uint num_uint;
    bool ok;
    num_uint = num_hex.toUInt(&ok,16);
    return (int)num_uint;
}
void Decompress(QString com, QString &decom)
{
    QString c;                 //current byte
    QString n;                 //new/next byte
    int bytePos(0);            //current position in QString
    int byteRepeat;            //number of times to repeat byte n

    c = com.mid(bytePos, 2);   //get first byte (01)
    decom.clear();             //clear decom just in case it had values prior

    do
    {
        bytePos = bytePos + 2;      //move the current position to the next byte
        if(c == "ff")               //is decompression happening?
        {
            c = com.mid(bytePos, 2);   //current byte is now the "next" byte
            byteRepeat = HexToIntS(c); //c tells us how many times the NEXT byte needs to be repeated

            if(byteRepeat <= 2)        //if c's value is <= 2... then ff is the value
            {
                n = "ff";              //new byte is just ff
                bytePos = bytePos + 2; //update the current position
            }
            else                       //if not, then c is the number of times the NEXT byte should be appended
            {
                n = com.mid(bytePos + 2, 2); //new byte is the NEXT byte
                bytePos = bytePos + 4;       //update the current position
            }

            for(int j = 0; j<=byteRepeat; j++)//append n the correct number of times
                decom.append(n);
        }
        else                   //guess we're not decompressing, so just append c
            decom.append(c);
        c = com.mid(bytePos, 2);   //get the new current byte
    }while(bytePos < com.length());  //stop when all bytes were read
}

Current optimized function based on your comments (only 5%-10% faster, in debug mode):

void Decompress2(const QString com, QString &decom)
{
    QStringRef c;
    QString n;
    int bytePos(0);
    int byteRepeat;

    c = com.midRef(bytePos, 2);
    decom.clear();

    do
    {
        bytePos = bytePos + 2;
        if(c == "ff")
        {
            c = com.midRef(bytePos, 2);
            byteRepeat = c.toString().toInt(0,16);

            if(byteRepeat <= 2)
            {
                n = "ff";
                bytePos = bytePos + 2;
            }
            else
            {
                n = com.mid(bytePos + 2, 2);
                bytePos = bytePos + 4;
            }

            for(int j = 0; j<=byteRepeat; j++)
                decom.append(n);
        }
        else
            decom.append(c);
        c = com.midRef(bytePos, 2);
    }while(bytePos < com.length());
}

Answer 1

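You shouldn't treat a byte array as a string. That's silly and, as you noticed, dead slow. Use the raw byte values instead and operate on those.

I know I shouldn't write code for other people, but I have absolutely nothing better to do, so here it is in straight C++. I know you're using Qt, and I'm fairly sure most of the code below has an equivalent in terms of Qt's QByteArray, but that is something you can work out yourself if pure C++ isn't an option.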

#include <algorithm>  // std::fill_n
#include <cstdint>
#include <iomanip>
#include <iostream>
#include <iterator>   // std::back_inserter
#include <vector>

std::vector<std::uint8_t> decompress(const std::vector<std::uint8_t>& com)
{
  std::vector<std::uint8_t> decom;
  decom.reserve(com.size()); // a conservative estimate of the required size

  for(auto it = begin(com); it != end(com); ++it)
  {
    if(*it == 0xff)
    {
      ++it;
      if(it != end(com))
      {
        std::uint8_t number_of_repeats = *it;
        if(number_of_repeats <= 2)
        {
          std::fill_n(std::back_inserter(decom), number_of_repeats, 0xff);
          continue;
        }
        else
        {
          ++it;
          if(it != end(com))
            std::fill_n(std::back_inserter(decom), number_of_repeats, *it);
          else
            throw 42; // handle error in some way
        }
      }
      else 
        throw 42; // handle error in some way
    }
    else
      decom.push_back(*it);
  }
  return decom;
}
int main()
{
  std::vector<std::uint8_t> com{0x01, 0x02, 0x03, 0xff, 0x05, 0x04, 0xff, 0x0a, 0x05, 0xff, 0x00, 0xff, 0x01, 0xff, 0x02, 0xff, 0x03, 0x06};


  for(const auto& value : com)
    std::cout << std::hex << std::setfill('0') << std::setw(2) << static_cast<unsigned short>(value) << ' ';
  std::cout << '\n';
  auto result = decompress(com);

  for(const auto& value : result)
    std::cout << std::hex << std::setfill('0') << std::setw(2) << static_cast<unsigned short>(value) << ' ';
}

Live demo here. I take no responsibility for the correctness, efficiency, or general usability of this code. It was written in five minutes.
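
If the data only exists as hex text, as in the question, it can be converted to raw bytes once up front so that a byte-oriented decoder never touches strings. The following is a minimal sketch in standard C++; hex_value and hex_to_bytes are hypothetical names, and rejecting bad input with std::invalid_argument is an assumption. In Qt, QByteArray::fromHex does a similar conversion.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: value of one hex digit, or -1 if ch is not a hex digit.
int hex_value(char ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    return -1;
}

// Hypothetical name: turns hex text such as "01ff0a" into the bytes 01 ff 0a.
std::vector<std::uint8_t> hex_to_bytes(const std::string& hex)
{
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");

    std::vector<std::uint8_t> bytes;
    bytes.reserve(hex.size() / 2); // exactly one byte per two characters

    for (std::size_t i = 0; i < hex.size(); i += 2)
    {
        const int hi = hex_value(hex[i]);
        const int lo = hex_value(hex[i + 1]);
        if (hi < 0 || lo < 0)
            throw std::invalid_argument("not a hexadecimal digit");
        bytes.push_back(static_cast<std::uint8_t>(hi * 16 + lo));
    }
    return bytes;
}
```

The result can be passed straight to a function that takes a const std::vector<std::uint8_t>&, such as decompress above.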

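Note that I believe the decompressed string in your long example is wrong. According to your rules, it should be

01 02 03 04 04 04 04 04 05 05 05 05 05 05 05 05 05 05 ff ff ff 06 06 06

Reading from the back: 06 repeated three times, then ff twice, then ff once, then ff zero times, and then the rest. This takes the count byte as the exact number of copies, which is what the code above does.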
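
The two readings of the count byte (exactly count copies, as in the code above, versus the "+ 1" stated in the question's rules) differ by one output byte per run. One way to check which reading a given expected output follows, and to size the output buffer exactly instead of estimating it, is a pass that only counts. This is a sketch with a made-up name (decoded_length) and a made-up extra_copies parameter; it is not part of the original answer.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: number of bytes a decoder would produce for com.
// extra_copies = 0 treats the count byte as the exact run length;
// extra_copies = 1 adds one copy per run (the "+ 1" in the question's rules).
std::size_t decoded_length(const std::vector<std::uint8_t>& com,
                           std::size_t extra_copies)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < com.size(); ++i)
    {
        if (com[i] != 0xff)
        {
            ++total; // ordinary byte, copied once
            continue;
        }
        if (i + 1 >= com.size())
            throw std::invalid_argument("missing count byte");
        const std::uint8_t count = com[++i];
        if (count > 2)
        {
            if (i + 1 >= com.size())
                throw std::invalid_argument("missing value byte");
            ++i; // skip the value byte; only its repeat count matters here
        }
        total += count + extra_copies;
    }
    return total;
}
```

For the 18-byte sample input used in main above, this gives 24 bytes with extra_copies = 0 and 30 bytes with extra_copies = 1. The result could be passed to reserve before decoding so that the output vector does not reallocate while it is being filled.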
