Question

When copying a known struct in memory, would you prefer to use memcpy or dereferencing? Why? Specifically, in the following code:

#include <stdio.h>
#include <string.h>

typedef struct {
    int foo;
    int bar;
} compound;

void copy_using_memcpy(compound *pto, compound *pfrom)
{
    memcpy(pto, pfrom, sizeof(compound));
}
void copy_using_deref(compound *pto, compound *pfrom)
{
    *pto = *pfrom;
}

int main(void)
{
    compound a = { 1, 2 };
    compound b = { 0 };
    compound *pa = &a;
    compound *pb = &b;

    // method 1
    copy_using_memcpy(pb, pa);
    // method 2
    copy_using_deref(pb, pa);
    printf("%d %d\n", b.foo, b.bar);

    return 0;
}

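Would you prefer method 1 or method 2? I looked at the assembly generated by gcc, and method 2 seems to use fewer instructions than method 1. Does that mean method 2 is preferable in this case? Thank you.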

Answer 1

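I can't think of any good reason to use memcpy() rather than assignment when copying a struct (as long as you don't need a deep copy or something involving the struct hack or a flexible array member, none of which apply in this case).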

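They have exactly the same semantics, and assignment (a) likely gives the compiler more opportunities for optimization, and (b) carries less risk of getting the size wrong.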
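To illustrate point (b), here is a minimal sketch that puts the two forms side by side. It reuses the `compound` type from the question; the function names `copy_by_assignment` and `copy_by_memcpy` are made up for this example. With assignment the compiler derives the size from the type, while with memcpy the caller has to supply it, so writing `sizeof *pto` instead of repeating the type name is one way to keep the size tied to the destination.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    int foo;
    int bar;
} compound;

/* Assignment: the size comes from the type, and assigning between
 * mismatched struct types is rejected at compile time. */
void copy_by_assignment(compound *pto, const compound *pfrom)
{
    assert(pto != NULL && pfrom != NULL);
    *pto = *pfrom;
}

/* memcpy: the size is written by hand. Passing sizeof(pto) (the size of
 * the pointer) or the size of the wrong type would still compile, so the
 * size is taken from the destination object itself. */
void copy_by_memcpy(compound *pto, const compound *pfrom)
{
    assert(pto != NULL && pfrom != NULL);
    memcpy(pto, pfrom, sizeof *pto);
}
```

Called as `copy_by_assignment(&b, &a);` or `copy_by_memcpy(&b, &a);`, both leave `b` with the same member values as `a`.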

Some very old C compilers may not have supported struct assignment, but that is no longer a significant concern.

(There are additional reasons to prefer assignment in C++, but your question is about C.)
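The flexible array member exception mentioned at the start of this answer can be sketched as follows. The `int_buffer` type and both helper functions are hypothetical and do not come from the question. Plain assignment copies only the fixed part of such a struct, so a copy that includes the trailing array has to compute the full size and use memcpy (or copy element by element). The sketch assumes C99 and ignores possible overflow in the size computation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration only. */
typedef struct {
    size_t len;
    int data[];   /* flexible array member (C99) */
} int_buffer;

/* Allocate a zero-initialised buffer with room for len elements.
 * Returns NULL if allocation fails. */
int_buffer *int_buffer_new(size_t len)
{
    int_buffer *buf = calloc(1, sizeof *buf + len * sizeof buf->data[0]);
    if (buf != NULL)
        buf->len = len;
    return buf;
}

/* Duplicate src including its trailing array. Writing *dst = *src here
 * would copy len but not the elements of data. Returns NULL on failure. */
int_buffer *int_buffer_clone(const int_buffer *src)
{
    assert(src != NULL);
    size_t bytes = sizeof *src + src->len * sizeof src->data[0];
    int_buffer *dst = malloc(bytes);
    if (dst != NULL)
        memcpy(dst, src, bytes);
    return dst;
}
```

The caller owns the returned buffers and releases each one with `free()`.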

Incidentally, the parentheses in

(*pto) = (*pfrom);

are unnecessary; the unary * binds tightly enough that this:

*pto = *pfrom;

is both correct and sufficiently clear to most readers.

Answer 2

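I prefer method 2 (the dereferencing method) for exactly the reason you mention. memcpy nominally performs a byte-by-byte copy and may carry the overhead of a function call (unless the compiler inlines it), whereas dereferencing just performs the copy, with no extra overhead.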

Dereferencing and assigning is also more readable (especially when you omit the superfluous parentheses:

*dest = *src;

）

Answer 3

I tried running this with Google's benchmark library:

#include <benchmark/benchmark.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    int foo;
    int bar;
    int a;
    int b;
    int c;
    int d;
    int e;
    int f;
    int g;
} compound;

static void copy_using_memcpy(benchmark::State& state) {
    compound a = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    compound b = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    compound* pa = &a;
    compound* pb = &b;
    for (auto _ : state) memcpy(pa, pb, sizeof(compound));
}
static void copy_using_deref(benchmark::State& state) {
    compound a = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    compound b = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    compound* pa = &a;
    compound* pb = &b;
    for (auto _ : state) *pa = *pb;
}

BENCHMARK(copy_using_memcpy);
BENCHMARK(copy_using_deref);

BENCHMARK_MAIN();

The results look like this:

> g++ benchmark.cc -lbenchmark -lpthread && ./a.out
2020-11-20T20:12:12+08:00
Running ./a.out
Run on (16 X 1796.56 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 4096 KiB (x1)
Load Average: 0.29, 0.15, 0.10
------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
copy_using_memcpy       2.44 ns         2.44 ns    282774223
copy_using_deref        1.77 ns         1.77 ns    389126375

With the original example, which has only two fields, the times are roughly the same.

Preference between memcpy and dereference
