Question

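I'm new to C++ (and to SO), so apologies if this is obvious.

I've started using temporary arrays in my code to cut down on repetition and make it easier to do the same thing to multiple objects. So instead of: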

MyObject obj1, obj2, obj3, obj4;

obj1.doSomming(arg);
obj2.doSomming(arg);
obj3.doSomming(arg);
obj4.doSomming(arg);

I'm doing:

MyObject obj1, obj2, obj3, obj4;
MyObject* objs[] = {&obj1, &obj2, &obj3, &obj4};

for (int i = 0; i !=4; ++i)
    objs[i]->doSomming(arg);

Is this bad for performance? For example, does it cause unnecessary memory allocation? Is it good practice? Thanks.

Answer 1

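In general, you should not worry about performance at this level. The things that end up causing performance problems are usually quite different from what you expect, especially if you don't have much experience with performance optimization.

You should always think about writing clear code first. If performance matters, think about it in algorithmic terms first (i.e. big-O). Then measure performance, and let the measurements guide your optimization work.

That said, you can make the code clearer and more direct by dropping the intermediate array of pointers and simply storing the original objects in an array: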
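As one way to act on "measure first", here is a minimal timing sketch using `std::chrono`. The helper `time_call`, the function `viaPointerArray`, and the body of `MyObject` are hypothetical stand-ins (the question does not show what `doSomming` does), and a real investigation would more likely use a profiler and many repetitions.

```cpp
#include <chrono>

// Hypothetical stand-in for the question's class; the real doSomming is unknown.
struct MyObject {
    long total = 0;
    void doSomming(int arg) { total += arg; }
};

// Hypothetical helper: runs f() `reps` times and returns the elapsed
// wall-clock time in nanoseconds.
template <typename F>
long long time_call(F f, int reps) {
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r != reps; ++r)
        f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// The variant from the question: a temporary array of pointers.
long viaPointerArray(int arg) {
    MyObject obj1, obj2, obj3, obj4;
    MyObject* objs[] = {&obj1, &obj2, &obj3, &obj4};
    for (int i = 0; i != 4; ++i)
        objs[i]->doSomming(arg);
    return obj1.total + obj2.total + obj3.total + obj4.total;
}
```

A call such as `time_call([] { viaPointerArray(1); }, 1000000)` gives a rough number to compare against the hand-written version. Note that an optimizing compiler may remove work whose result is unused, so very small timings should be read with caution.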

MyObject obj[4];

for (int i = 0; i !=4; ++i)
    obj[i].doSomming(arg);

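But no, an optimizing compiler should generally have no trouble with the pointer-array version.

To illustrate what the optimizer does, take this code: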
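If C++11 is available (an assumption; the question does not say which standard is in use), the plain-array idea can also be sketched with `std::array` and a range-based `for` loop, which removes the hard-coded `4`. The `MyObject` body and the function `sumAfterLoop` are hypothetical, added only so the effect of the loop is observable.

```cpp
#include <array>

// Hypothetical stand-in for the question's class.
struct MyObject {
    int calls = 0;
    void doSomming(int arg) { calls += arg; }
};

int sumAfterLoop(int arg) {
    // The objects live directly inside the array: no pointers and no heap allocation.
    std::array<MyObject, 4> objs{};

    for (MyObject& o : objs)
        o.doSomming(arg);

    // Add up the per-object counters so the caller can check the result.
    int total = 0;
    for (const MyObject& o : objs)
        total += o.calls;
    return total;
}
```

Like a built-in array, `std::array` has automatic storage when declared as a local variable, so this form does not introduce dynamic allocation either.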

#include <cstdio>

struct MyObject {
    void doSomming() {
        std::printf("Hello\n");
    }
};

void foo1() {
    MyObject obj1, obj2, obj3, obj4;

    obj1.doSomming();
    obj2.doSomming();
    obj3.doSomming();
    obj4.doSomming();
}

void foo2() {
    MyObject obj1, obj2, obj3, obj4;
    MyObject* objs[] = {&obj1, &obj2, &obj3, &obj4};

    for (int i = 0; i !=4; ++i)
        objs[i]->doSomming();
}

void foo3() {
    MyObject obj[4];

    for (int i = 0; i !=4; ++i)
        obj[i].doSomming();
}

and generate LLVM IR (because it is more compact than actual assembly), I get the following at -O3:

define void @_Z4foo1v() nounwind uwtable ssp {
entry:
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i1 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i2 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i3 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  ret void
}

define void @_Z4foo2v() nounwind uwtable ssp {
entry:
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.1 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.2 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.3 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  ret void
}

define void @_Z4foo3v() nounwind uwtable ssp {
entry:
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.1 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.2 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.3 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  ret void
}

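At -O3 the loops are unrolled and the code is identical to the original version. With -Os the loops are not unrolled, but the pointer indirection and even the arrays disappear, because they are no longer needed after inlining: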

define void @_Z4foo2v() nounwind uwtable optsize ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %i.05 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %inc = add nsw i32 %i.05, 1
  %cmp = icmp eq i32 %inc, 4
  br i1 %cmp, label %for.end, label %for.body

for.end:                                          ; preds = %for.body
  ret void
}

define void @_Z4foo3v() nounwind uwtable optsize ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %i.03 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %inc = add nsw i32 %i.03, 1
  %cmp = icmp eq i32 %inc, 4
  br i1 %cmp, label %for.end, label %for.body

for.end:                                          ; preds = %for.body
  ret void
}
