Fastest way to count the number of set bits in an interval

Time: 2012-01-26 17:39:37

Tags: bit-manipulation

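I need a fast way to count the number of set bits within an index interval of a bit vector. For example, given 10000100100011000 and the index interval [2, 5], the return value is 2. Indices start at 0 from the right. I have many queries to answer in this fashion. Is counting the bits of each prefix separately and taking the difference the best way, or is there any preprocessing that could reduce the complexity?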

2 answers:

Answer 0: (score: 1)

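Here is a way to implement Dave's suggestion that works for all integer types and for std::bitset. The bits outside the range are zeroed by shifting the vector right and then left. You may want to pass T by const& if you use very large bitsets. You may also need to watch out for implicit conversions when passing 8-bit and 16-bit integers: they are promoted to int before shifting, so the left shift does not discard their high bits.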

#include <bitset>   // std::bitset
#include <cstddef>  // size_t

// primary template for POD types
template<typename T>
struct num_bits
{
    enum { value = 8 * sizeof(T) };
};

// partial specialization for std::bitset
template<size_t N>
struct num_bits< std::bitset<N> >
{
    enum { value = N };
};

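// count all 1-bits in n
// (plain shift-and-test loop used as a stand-in: substitute your favorite
// algorithm; assumes T is an unsigned integer type or std::bitset)
template<typename T>
size_t bit_count(T n)
{
    size_t count = 0;
    for (; n != T(0); n >>= 1)
        if ((n & T(1)) != T(0)) ++count;
    return count;
}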

// count all 1-bits in n in the range [First, Last)
template<typename T>
size_t bit_count(T n, size_t First, size_t Last)
{
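    // type T needs overloaded operator<< and operator>>
    // shift right to drop the bits below First, then left by the unused width
    // to drop the bits at or above Last (requires First < Last <= num_bits<T>::value)
    return bit_count((n >> First) << (num_bits<T>::value - (Last - First)));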
}

// example: count 1-bits in the range [2, 5] == [2, 6)  
size_t result = bit_count(n, 2, 6);
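
The shift technique can be checked against the example from the question with a small self-contained sketch. The helper names `count_range_u32`, `count_range_bitset` and `count_range_u8` are hypothetical (they do not come from the answer), and `std::bitset::count` merely stands in for whichever population count you prefer. The 8-bit variant illustrates the implicit-conversion caveat: the shifted value has to be cast back to 8 bits, otherwise the high bits survive the left shift.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: count 1-bits of a 32-bit value in the half-open
// index range [first, last), where index 0 is the rightmost bit.
std::size_t count_range_u32(std::uint32_t n, std::size_t first, std::size_t last)
{
    if (first >= last || last > 32) return 0;  // empty/invalid range; also avoids a shift by 32
    n <<= (32 - last);                 // drop bits at index >= last
    n >>= (32 - last + first);         // drop bits at index < first
    return std::bitset<32>(n).count(); // stand-in for any population count
}

// Same idea for std::bitset<N>; shifts on bitsets discard bits that fall off either end.
template <std::size_t N>
std::size_t count_range_bitset(std::bitset<N> n, std::size_t first, std::size_t last)
{
    if (first >= last || last > N) return 0;
    n <<= (N - last);
    n >>= (N - last + first);
    return n.count();
}

// 8-bit variant: `n << k` is evaluated as int after integer promotion,
// so the result must be cast back to 8 bits to discard the high bits.
std::size_t count_range_u8(std::uint8_t n, std::size_t first, std::size_t last)
{
    if (first >= last || last > 8) return 0;
    const std::uint8_t high_cleared = static_cast<std::uint8_t>(n << (8 - last));
    return std::bitset<8>(high_cleared >> (8 - last + first)).count();
}
```

With the question's bit vector 10000100100011000 (0x10918), `count_range_u32(0x10918u, 2, 6)` and `count_range_bitset(std::bitset<17>(0x10918u), 2, 6)` both count the bits at indexes 3 and 4.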

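Answer 1: (score: 1)

Assume a is the lower index and b is the higher index, both counted from right to left. Assume the input data v is normalized to a size of 64 bits (although this can be modified for smaller sizes).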

Data  10000100100011000
Index .......9876543210

C code:

  #include <stdint.h>

  uint64_t getSetBitsInRange(uint64_t v, uint32_t a, uint32_t b) {
      // a & b are inclusive indexes, counted from the right (bit 0 = LSB)
      // check invariants: 'a' must not exceed 'b', and 'b' must fit in 64 bits
      if (a > b || b > 63) { return ~(uint64_t)0; }

      uint64_t mask, submask_1, submask_2;
      submask_1   = submask_2 = 0x01;
      submask_1 <<= a;             // set bit a (counted from the right)
      submask_1  -= 1;             // fill all bits below bit a; 0 when a == 0
      submask_2 <<= b;             // set bit b (counted from the right)
      submask_2  |= submask_2 - 1; // fill all bits below bit b
      mask = submask_1 ^ submask_2; // ones at bits a..b inclusive
      v &= mask;   // 'v' now only has set bits in specified range

      // Now utilize any population count algorithm tuned for 64 bits.
      // Do some research and benchmarking to find the best one for you.
      // I chose this one because it scales easily to smaller sizes.
      // Note: many chipsets have hardware "pop-count" implementations.
      // Software 64-bit population count algorithm (parallel bit count):

      const uint64_t m[6] = { 0x5555555555555555ULL, 0x3333333333333333ULL,
                              0x0f0f0f0f0f0f0f0fULL, 0x00ff00ff00ff00ffULL,
                              0x0000ffff0000ffffULL, 0x00000000ffffffffULL};           
      v = (v & m[0]) + ((v >> 0x01) & m[0]);
      v = (v & m[1]) + ((v >> 0x02) & m[1]);
      v = (v & m[2]) + ((v >> 0x04) & m[2]);
      v = (v & m[3]) + ((v >> 0x08) & m[3]);   //comment out this line & below to make  8bit
      v = (v & m[4]) + ((v >> 0x10) & m[4]);   //comment out this line & below to make 16bit 
      v = (v & m[5]) + ((v >> 0x20) & m[5]);   //comment out this line to make 32bit
      return v;
  }
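
As a cross-check of the mask-then-count idea, here is a minimal C++ sketch that builds the inclusive mask directly and delegates the counting to `std::bitset<64>::count`. The names `range_mask` and `count_in_range` are hypothetical, and the sketch assumes a <= b <= 63. Whether `count` compiles down to a hardware pop-count instruction depends on the compiler and target flags, so benchmark before relying on it.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: mask with ones at bit indexes a..b inclusive
// (index 0 is the rightmost bit). Assumes a <= b <= 63.
std::uint64_t range_mask(unsigned a, unsigned b)
{
    // Ones at indexes 0..b; b == 63 is special-cased to avoid shifting by 64.
    const std::uint64_t upto_b =
        (b == 63) ? ~std::uint64_t(0) : ((std::uint64_t(1) << (b + 1)) - 1);
    // Ones at indexes 0..a-1; this is 0 when a == 0.
    const std::uint64_t below_a = (std::uint64_t(1) << a) - 1;
    return upto_b & ~below_a;
}

// Count the set bits of v at indexes a..b inclusive.
std::size_t count_in_range(std::uint64_t v, unsigned a, unsigned b)
{
    return std::bitset<64>(v & range_mask(a, b)).count();
}
```

For the data shown above (10000100100011000, i.e. 0x10918) and the interval [2, 5], `range_mask(2, 5)` is 0x3C and `count_in_range(0x10918, 2, 5)` counts the bits at indexes 3 and 4.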