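Suppose I need to assign zero to a large chunk of memory. On a 32-bit architecture, is an assignment of long long (8 bytes on this particular architecture) more efficient than an assignment of int (4 bytes), or is it equivalent to two int assignments? And is assigning by int more efficient than assigning by char for the same chunk of memory, given that with char instead of int I would need to loop 4 times as often?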
Answer 0: (score: 7)
Why not use memset()?
http://www.elook.org/programming/c/memset.html
(from the site above)
Syntax:
#include <string.h>
void *memset( void *buffer, int ch, size_t count );
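Description
The function memset() copies ch into the first count characters of buffer, and returns buffer. memset() is useful for initializing a section of memory to some value. For example, this command: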
memset( the_array, '\0', sizeof(the_array) );
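As a minimal sketch of the call above (the helper names zero_ints and all_zero are made up for illustration), the snippet below wraps memset() for a buffer that is reached through a pointer. sizeof(the_array) yields the full size only when the_array is a real array in scope; through a pointer, the element count has to be passed explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero n ints starting at a. */
void zero_ints(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    /* sizeof *a is the size of one element; a itself is only a pointer here. */
    memset(a, 0, n * sizeof *a);
}

/* Hypothetical helper: return 1 if all n ints are zero, else 0. */
int all_zero(const int *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != 0)
            return 0;
    return 1;
}
```

Calling zero_ints(values, count) on any int buffer leaves all_zero(values, count) true, because an all-zero-bits int is the value 0.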
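Answer 1: (score: 2)
To your questions, the answers are yes and yes, provided the compiler is smart and optimizes.
Interesting note: on machines with SSE we can work with 128-bit chunks :) Still, and this is just my opinion, always try to balance readability with conciseness, so yeah... I tend to use memset. It is not always perfect and may not be the fastest, but it tells whoever maintains the code "hey, I am initializing or setting this array".
Anyway, here is some test code; let me know if anything needs correcting.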
#include <time.h>
#include <emmintrin.h> /* SSE2: declares __m128i and _mm_setzero_si128 */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define NUMBER_OF_VALUES 33554432
int main(void)
{
    int *values;
    /* 16-byte alignment is needed for the aligned __m128i stores further down. */
    int result = posix_memalign((void **)&values, 16, NUMBER_OF_VALUES * sizeof(int));
    if (result)
    {
        printf("Failed to allocate memory\n");
        exit(EXIT_FAILURE);
    }
    clock_t start, end;
    /* Untimed pass: touches every page, so the timed runs exclude first-touch page faults. */
    int *temp = values, total = NUMBER_OF_VALUES;
    while (total--)
        *temp++ = 0;
    start = clock();
    memset(values, 0, sizeof(int) * NUMBER_OF_VALUES);
    end = clock();
    printf("memset time %f\n", ((double) (end - start)) / CLOCKS_PER_SEC);
    start = clock();
    {
        int index = 0, total = NUMBER_OF_VALUES * sizeof(int);
        char *temp = (char *)values;
        for(; index < total; index++)
            temp[index] = 0;
    }
    end = clock();
    printf("char-wise for-loop array indices time %f\n", ((double) (end - start)) / CLOCKS_PER_SEC);
    start = clock();
    {
        int index = 0, *temp = values, total = NUMBER_OF_VALUES;
        for (; index < total; index++)
            temp[index] = 0;
    }
    end = clock();
    printf("int-wise for-loop array indices time %f\n", ((double) (end - start)) / CLOCKS_PER_SEC);
    start = clock();
    {
        int index = 0, total = NUMBER_OF_VALUES/2;
        long long int *temp = (long long int *)values;
        for (; index < total; index++)
            temp[index] = 0;
    }
    end = clock();
    printf("long-long-int-wise for-loop array indices time %f\n", ((double) (end - start)) / CLOCKS_PER_SEC);
    start = clock();
    {
        int index = 0, total = NUMBER_OF_VALUES/4;
        __m128i zero = _mm_setzero_si128();
        __m128i *temp = (__m128i *)values;
        for (; index < total; index++)
            temp[index] = zero;
    }
    end = clock();
    printf("SSE-wise for-loop array indices time %f\n", ((double) (end - start)) / CLOCKS_PER_SEC);
    start = clock();
    {
        char *temp = (char *)values;
        int total = NUMBER_OF_VALUES * sizeof(int);
        while (total--)
            *temp++ = 0;
    }
    end = clock();
    printf("char-wise while-loop pointer arithmetic time %f\n", ((double) (end - start)) / CLOCKS_PER_SEC);
    start = clock();
    {
        int *temp = values, total = NUMBER_OF_VALUES;
        while (total--)
            *temp++ = 0;
    }
    end = clock();
    printf("int-wise while-loop pointer arithmetic time %f\n", ((double) (end - start)) / CLOCKS_PER_SEC);
    start = clock();
    {
        long long int *temp = (long long int *)values;
        int total = NUMBER_OF_VALUES/2;
        while (total--)
            *temp++ = 0;
    }
    end = clock();
    printf("long-ling-int-wise while-loop pointer arithmetic time %f\n", ((double) (end - start)) / CLOCKS_PER_SEC);
    start = clock();
    {
        __m128i zero = _mm_setzero_si128();
        __m128i *temp = (__m128i *)values;
        int total = NUMBER_OF_VALUES/4;
        while (total--)
            *temp++ = zero;
    }
    end = clock();
    printf("SSE-wise while-loop pointer arithmetic time %f\n", ((double) (end - start)) / CLOCKS_PER_SEC);
    free(values);
    return 0;
}
My results and setup:
$ gcc time.c
$ ./a.out
memset time 0.025350
char-wise for-loop array indices time 0.334508
int-wise for-loop array indices time 0.089259
long-long-int-wise for-loop array indices time 0.046997
SSE-wise for-loop array indices time 0.028812
char-wise while-loop pointer arithmetic time 0.271187
int-wise while-loop pointer arithmetic time 0.072802
long-ling-int-wise while-loop pointer arithmetic time 0.039587
SSE-wise while-loop pointer arithmetic time 0.030788
$ gcc -O2 -Wall time.c
$ ./a.out
memset time 0.025129
char-wise for-loop array indices time 0.084930
int-wise for-loop array indices time 0.025263
long-long-int-wise for-loop array indices time 0.028245
SSE-wise for-loop array indices time 0.025909
char-wise while-loop pointer arithmetic time 0.084485
int-wise while-loop pointer arithmetic time 0.025277
long-ling-int-wise while-loop pointer arithmetic time 0.028187
SSE-wise while-loop pointer arithmetic time 0.025823
$ gcc --version
i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ uname -a
Darwin MacBookPro 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun 7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386
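memset is probably heavily optimized, likely using inline assembly, though that again varies from compiler to compiler...
gcc seems to optimize quite aggressively when -O2 is specified, as some of the times start to converge. I guess I should look at the assembly.
If you are curious, just run gcc -S -msse2 -O2 -Wall time.c; the assembly is written to time.s.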
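The repeated start/clock/printf pattern in the test program could be factored into one helper. The following is only a sketch under assumptions: the names fill_fn, fill_memset, fill_bytes and time_fill are hypothetical and not part of the original test, and clock() measures CPU time rather than wall-clock time. The byte loop writes through a volatile pointer, so the compiler has to emit every single byte store instead of collapsing the loop into a memset call, which also means it measures a strictly byte-at-a-time loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical signature shared by every fill routine being compared. */
typedef void (*fill_fn)(void *buf, size_t nbytes);

/* Baseline: let the library do it. */
void fill_memset(void *buf, size_t nbytes)
{
    memset(buf, 0, nbytes);
}

/* Byte-wise loop; volatile forces one store per byte. */
void fill_bytes(void *buf, size_t nbytes)
{
    volatile unsigned char *p = buf;
    for (size_t i = 0; i < nbytes; i++)
        p[i] = 0;
}

/* Time one call of fn and return the elapsed CPU time in seconds. */
double time_fill(fill_fn fn, void *buf, size_t nbytes)
{
    assert(fn != NULL);
    clock_t start = clock();
    fn(buf, nbytes);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

A caller would print time_fill(fill_memset, values, size) and time_fill(fill_bytes, values, size) side by side; further variants only need to match the fill_fn signature.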
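Answer 2: (score: 0)
Always avoid extra iterations in higher-level programming languages. Your code will be more efficient if you iterate once over an int instead of looping over each of its bytes.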
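To make the "one int store instead of four byte stores" idea concrete, here is a rough sketch. The function name zero_wordwise is hypothetical, and the code assumes the buffer is either dynamically allocated or was declared as uint32_t, so that storing through a uint32_t pointer does not violate the aliasing rules. In practice memset() already does this kind of thing internally, usually with wider stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: zero nbytes starting at buf, using 32-bit stores
 * for the aligned middle part and byte stores for the unaligned edges.
 */
void zero_wordwise(void *buf, size_t nbytes)
{
    unsigned char *p = buf;

    assert(buf != NULL || nbytes == 0);

    /* Leading bytes, until p is aligned for uint32_t. */
    while (nbytes > 0 && ((uintptr_t)p % sizeof(uint32_t)) != 0) {
        *p++ = 0;
        nbytes--;
    }

    /* Aligned middle: one store per 4 bytes instead of four. */
    uint32_t *w = (uint32_t *)p;
    while (nbytes >= sizeof(uint32_t)) {
        *w++ = 0;
        nbytes -= sizeof(uint32_t);
    }

    /* Trailing bytes that do not fill a whole word. */
    p = (unsigned char *)w;
    while (nbytes > 0) {
        *p++ = 0;
        nbytes--;
    }
}
```

The middle loop runs a quarter as many iterations as a pure byte loop, which is the saving this answer describes; whether it is faster in practice depends on the compiler, as the -O2 timings in the previous answer suggest.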
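Answer 3: (score: 0)
On most architectures assignments are optimized so that they are aligned to the word size, which is 4 bytes on 32-bit x86. So for memory of the same size it does not matter which type you assign through (there is no difference between a memset over 1 MB of long and a memset over 1 MB of char).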
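One reason the element type makes no difference to memset() is that it always works on bytes: it writes the same byte value into every byte of the range. The sketch below (helper names repeat_byte and fill_uints are hypothetical; 8-bit bytes are assumed) shows that this gives the expected result for zero, but not for other values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build the unsigned int whose every byte equals `byte`. */
unsigned int repeat_byte(unsigned char byte)
{
    unsigned int v = 0;
    for (size_t i = 0; i < sizeof v; i++)
        v = (v << 8) | byte;
    return v;
}

/* Hypothetical helper: set every byte of n unsigned ints to `byte`. */
void fill_uints(unsigned int *a, size_t n, unsigned char byte)
{
    assert(a != NULL || n == 0);
    memset(a, byte, n * sizeof *a);
}
```

After fill_uints(a, n, 0) every element is 0, but after fill_uints(a, n, 1) every element equals repeat_byte(1) (0x01010101 with a 4-byte unsigned int), not 1.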
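Answer 4: (score: 0)
1. long long (8 bytes) vs two int (4 bytes)
- long long is better, because assigning one 8-byte element performs better than assigning two 4-byte elements.
2. int (4 bytes) vs four char (1 byte)
- int is the better choice here.
If you are declaring only a single element, you can assign zero directly, as below.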
long long a;
int b;
....
a = 0; b = 0;
But if you are declaring an array of n elements, go for the memset function, as below.
long long a[10];
int b[20];
....
memset(a, 0, sizeof(a));
memset(b, 0, sizeof(b));
If you want to initialize at the point of declaration, there is no need for memset.
long long a = 0;
int b = 0;
or
long long a[10] = {0};
int b[20] = {0};
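A short sketch of why = {0} is enough: elements without an explicit initializer are zero-initialized as well. The function names below are hypothetical. calloc() is not mentioned in this answer; it is added here only as the standard-library counterpart for heap memory that should start out zeroed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Count the non-zero elements of a local array initialized with = {0}. */
size_t nonzero_after_brace_init(void)
{
    long long a[10] = {0};   /* a[0] is set explicitly, a[1]..a[9] are zeroed too */
    size_t count = 0;
    for (size_t i = 0; i < sizeof a / sizeof a[0]; i++)
        if (a[i] != 0)
            count++;
    return count;
}

/* Hypothetical helper: allocate n zeroed ints on the heap; the caller frees them. */
int *new_zeroed_ints(size_t n)
{
    assert(n != 0);   /* a zero-size allocation has implementation-defined results */
    return calloc(n, sizeof(int));
}
```

nonzero_after_brace_init() returns 0, and a non-NULL result of new_zeroed_ints(n) can be read immediately without a separate memset call.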