Question

Suppose I have a C++ struct:

struct Clazz {
    uint8_t a : 2;
    uint8_t b : 6;
};

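I want to be able to swap objects of this class as fast as possible. Is it best to just call std::swap(cl1, cl2), or should I specialize it, and if so, how? Would something like this help?

void Clazz::swap(Clazz& other) {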
    std::swap(a, other.a);
    std::swap(b, other.b); // how to make C++ swap the whole uint8_t value at once?
}

Answer 1

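I would expect a decent compiler to do the right thing without you providing a swap at all, so you should measure which version is actually faster. One thing you might want to try is casting to uint8_t: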

void Clazz::swap(Clazz& other) {
    std::swap(reinterpret_cast<uint8_t&>(*this), reinterpret_cast<uint8_t&>(other));
}

Or simply rewrite the swap using memcpy.
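The memcpy variant mentioned above is not spelled out in the answer, so here is a minimal sketch of what it could look like. The names memcpy_swap and memcpy_swap_demo are hypothetical, and the sketch assumes Clazz stays trivially copyable, which the static_assert checks.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

struct Clazz {
    uint8_t a : 2;
    uint8_t b : 6;
};

// Copying raw bytes is only valid for trivially copyable types.
static_assert(std::is_trivially_copyable<Clazz>::value,
              "memcpy-based swap requires a trivially copyable type");

// Hypothetical helper: swaps the whole object representation at once
// instead of swapping the bit-fields one by one.
inline void memcpy_swap(Clazz& lhs, Clazz& rhs) {
    if (&lhs == &rhs) return;  // memcpy must not be given overlapping ranges
    unsigned char tmp[sizeof(Clazz)];
    std::memcpy(tmp, &lhs, sizeof(Clazz));
    std::memcpy(&lhs, &rhs, sizeof(Clazz));
    std::memcpy(&rhs, tmp, sizeof(Clazz));
}

// Hypothetical usage check with the same values the thread uses elsewhere.
inline void memcpy_swap_demo() {
    Clazz c1{}, c2{};
    c1.a = 1; c1.b = 61;
    c2.a = 3; c2.b = 63;
    memcpy_swap(c1, c2);
    assert(c1.a == 3 && c1.b == 63);
    assert(c2.a == 1 && c2.b == 61);
}
```

Unlike the reinterpret_cast version, this does not depend on the struct occupying exactly one byte, since it copies sizeof(Clazz) bytes whatever that is.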

OK, now on to the compiled code. Tested with gcc 5.2 at the -O2 optimization level:

void test1(Clazz& a, Clazz& b) {
  a.swap(b);
}

void test2(Clazz& a, Clazz& b) {
  std::swap(a, b);
}

Generated code:

test1(Clazz&, Clazz&):
    movzbl  (%rdi), %eax
    movzbl  (%rsi), %edx
    movb    %dl, (%rdi)
    movb    %al, (%rsi)
    ret
test2(Clazz&, Clazz&):
    movzbl  (%rdi), %eax
    movzbl  (%rsi), %edx
    movb    %dl, (%rdi)
    movb    %al, (%rsi)
    ret


Answer 2

Suppose you have the following code:

struct Clazz {
    uint8_t a : 2;
    uint8_t b : 6;
};

Clazz c1,c2;
c1.a = 1; c1.b = 61;
c2.a = 3; c2.b = 63;

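You can call std::swap(c1, c2) on the class without any additional swap function, and things will swap as expected. Note that results can vary by implementation, and rough performance testing (based on another answer) suggests that a custom swap (in the case of this code) can be faster. Thanks to user Nax for the reminder about timing; I have edited the test code below to account for a larger number of iterations.

If the test code below is compiled with g++ 4.2.1 with no optimizations, the rough average for the std::swap loop comes to 0.00523261 us, while the rough average for Clazz::swap comes to 0.00996584 us. This implies that the std::swap implementation is faster than reinterpret_cast'ing the Clazz object to a uint8_t. However, when you turn on maximum optimizations (-O3), the picture reverses: the std::swap loop comes in at 0.00102523 us and the Clazz::swap loop at 0.000739868 us, a significant speed increase over the std::swap method.

Additionally, if your Clazz object becomes more complex than what the default constructor covers, so that you must provide a copy constructor, the times for std::swap almost double, because the std::swap implementation uses the object's copy constructor when creating its temporary. Adding an empty default constructor and a copy constructor to the Clazz object (i.e. Clazz(const Clazz& cp) : a(cp.a), b(cp.b) {}) changes the std::swap times to the following: no optimizations ~ 0.0104058 us, -O3 ~ 0.00241034 us; roughly double the time, while the Clazz::swap method stays consistent.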
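To make the copy-constructor point concrete, here is a small sketch that counts how often the copy constructor runs during one std::swap. The type Counted, its static counter, and the function copies_made_by_std_swap are hypothetical names introduced only for this illustration. On the common standard library implementations, std::swap builds one temporary and then performs two assignments, so the count is typically one.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical variant of Clazz with a user-provided copy constructor.
// Declaring it suppresses the implicit move constructor, so std::swap
// falls back to copying when it creates its temporary.
struct Counted {
    uint8_t a : 2;
    uint8_t b : 6;
    static int copies;  // number of copy-constructor calls so far

    Counted() : a(0), b(0) {}
    Counted(const Counted& cp) : a(cp.a), b(cp.b) { ++copies; }
    Counted& operator=(const Counted&) = default;
};
int Counted::copies = 0;

// Returns how many copy constructions a single std::swap triggered.
int copies_made_by_std_swap() {
    Counted x, y;
    x.a = 1; x.b = 61;
    y.a = 3; y.b = 63;
    Counted::copies = 0;
    std::swap(x, y);
    assert(x.a == 3 && x.b == 63);  // the values really were exchanged
    assert(y.a == 1 && y.b == 61);
    return Counted::copies;
}
```

A member swap that exchanges the underlying byte never goes through this constructor, which is consistent with the Clazz::swap timings staying the same.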

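Given that, if you want std::swap to call your custom Clazz::swap method, there are a few ways to achieve it, because std::swap does not automatically call a swap method defined by the class (i.e. std::swap will not call a.swap(b)).

You can add an overload of std::swap for your class directly, as shown here. Note that the standard only permits adding template specializations, not new overloads, to namespace std, so this is formally undefined behavior even though common compilers accept it: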

namespace std {
    void swap(Clazz& a, Clazz& b)
    {
        std::swap(reinterpret_cast<uint8_t&>(a), reinterpret_cast<uint8_t&>(b));
        /* or a.swap(b) if your Clazz has private types that need to be
        accounted for and you provide a public swap method */
    }
}

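However, doing it this way means argument-dependent lookup (ADL) will not find this overload, so calling swap(a, b) is not the same as calling std::swap(a, b).

To satisfy ADL, you just need to put the swap function in the same namespace as the Clazz object (whether that is the global namespace or a named/anonymous one), for example: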

using std::swap;

namespace ClazzNS {
    struct Clazz {
        uint8_t a : 2;
        uint8_t b : 6;
    };

    void swap(Clazz& a, Clazz& b)
    {
        std::swap(reinterpret_cast<uint8_t&>(a), reinterpret_cast<uint8_t&>(b));
        /* or a.swap(b) if your Clazz has private types that need to be
        accounted for and you provide a public swap method */
    }
}

int main()
{
    ClazzNS::Clazz c1,c2;
    c1.a = 1; c1.b = 61;
    c2.a = 3; c2.b = 63;
    swap(c1, c2); // calls ClazzNS::swap over std::swap
}

This way you can call swap(a, b) without explicitly naming the namespace.

If you want to be pedantic, you can mix them all together to get the following:
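The ADL approach pays off most in generic code, where the usual idiom is a using-declaration for std::swap followed by an unqualified call. The sketch below illustrates that; exchange_values, custom_swap_calls and dispatch_demo are hypothetical names, and the counter exists only to show which swap was picked.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

namespace ClazzNS {
    struct Clazz {
        uint8_t a : 2;
        uint8_t b : 6;
    };

    int custom_swap_calls = 0;  // hypothetical counter, for illustration only

    // Found through argument-dependent lookup for ClazzNS::Clazz arguments.
    void swap(Clazz& a, Clazz& b) {
        ++custom_swap_calls;
        Clazz tmp = a;
        a = b;
        b = tmp;
    }
}

// Generic helper: uses a type's own swap when ADL finds one,
// and falls back to std::swap otherwise.
template <typename T>
void exchange_values(T& x, T& y) {
    using std::swap;
    swap(x, y);
}

// Hypothetical usage check.
void dispatch_demo() {
    ClazzNS::Clazz c1{}, c2{};
    c1.a = 1; c1.b = 61;
    c2.a = 3; c2.b = 63;
    exchange_values(c1, c2);  // picks ClazzNS::swap
    assert(c1.a == 3 && c2.a == 1);

    int i = 1, j = 2;
    exchange_values(i, j);    // no ADL candidate, so std::swap is used
    assert(i == 2 && j == 1);
}
```

Calling std::swap(x, y) with the explicit qualifier inside exchange_values would bypass ClazzNS::swap entirely, which is why the unqualified call matters.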

namespace ClazzNS {
    struct Clazz {
        uint8_t a : 2;
        uint8_t b : 6;

        void swap(ClazzNS::Clazz& other)
        {
            std::swap(reinterpret_cast<uint8_t&>(*this), reinterpret_cast<uint8_t&>(other));
        }
    };

    void swap(ClazzNS::Clazz& a, ClazzNS::Clazz& b)
    {
        a.swap(b);
    }
}

namespace std {
    void swap(ClazzNS::Clazz& a, ClazzNS::Clazz& b)
    {
        ClazzNS::swap(a, b);
        /*
        you could also directly just call
        std::swap(reinterpret_cast<uint8_t&>(a), reinterpret_cast<uint8_t&>(b))
        or a.swap(b) if you wanted to avoid
        multiple function calls */
    }
}

int main()
{
    ClazzNS::Clazz c1,c2;
    c1.a = 1; c1.b = 61;
    c2.a = 3; c2.b = 63;
    c1.swap(c2); // calls ClazzNS::Clazz::swap 
    swap(c1, c2); // calls ClazzNS::swap
    std::swap(c1, c2); // calls the overloaded std::swap
}

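Your results may vary, and ultimately it comes down to how you implement your swap method, so it is best to test this yourself; I hope these numbers help.

C++ test code:

#include <iostream>
#include <algorithm>
#include <utility>  // std::swap (C++11 and later)
#include <cstdint>  // uint8_t
#include <ctime>    // clock_gettime, timespec (POSIX)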

struct Clazz {
    uint8_t a : 2;
    uint8_t b : 6;

    void swap(Clazz& other)
    {
        std::swap(reinterpret_cast<uint8_t&>(*this), reinterpret_cast<uint8_t&>(other));
    }
};

static double elapsed_us(struct timespec init, struct timespec end)
{
    return ((end.tv_sec - init.tv_sec) * 1000000) + (static_cast<double>((end.tv_nsec - init.tv_nsec)) / 1000);
}

static void printall(const Clazz& c1, const Clazz& c2)
{
    std::cout << "c1.a:" << static_cast<unsigned int>(c1.a) << ", c1.b:" << static_cast<unsigned int>(c1.b) << std::endl;
    std::cout << "c2.a:" << static_cast<unsigned int>(c2.a) << ", c2.b:" << static_cast<unsigned int>(c2.b) << std::endl;
}

int main() {
    int max_cnt = 100000001;
    struct timespec init, end;
    Clazz c1, c2;
    c1.a = 1; c1.b = 61;
    c2.a = 3; c2.b = 63;
    printall(c1, c2);

    std::cout << "std::swap" << std::endl;
    std::swap(c1, c2); // to show they actually swap
    printall(c1, c2);

    std::cout << "c1.swap(c2)" << std::endl;
    c1.swap(c2); // to show again they actually swap
    printall(c1, c2);

    std::cout << "std::swap loop" << std::endl;
    clock_gettime(CLOCK_MONOTONIC, &init);
    for (int i = 0; i < max_cnt; ++i) {
        std::swap(c1, c2);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    printall(c1, c2);
    // rough estimate of timing / divide by iterations
    std::cout << "std::swap avg. us = " << (elapsed_us(init, end) / max_cnt) << " us" << std::endl;

    std::cout << "Clazz::swap loop" << std::endl;
    clock_gettime(CLOCK_MONOTONIC, &init);
    for (int i = 0; i < max_cnt; ++i) {
        c1.swap(c2);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    printall(c1, c2);
    // rough estimate of timing / divide by iterations
    std::cout << "Clazz::swap avg. us = " << (elapsed_us(init, end) / max_cnt) << " us" << std::endl;

    return 0;
}

Hope that helps.

How do I specialize `swap` for a class with bit-field members?
