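A grep I built is slower than the grep that ships with Linux

Time: 2009-12-23 22:22:29

Tags: gcc grep compiler-flags

I am trying to understand why the grep I built is so much slower than the one that comes with the system, and I am trying to find out which compiler options the system grep was built with.

OS version: CentOS release 5.3 (Final)

grep on the system: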

  Version: grep (GNU grep) 2.5.1
  Size: 88896 bytes
  ldd output: 
 libpcre.so.0 => /lib64/libpcre.so.0 (0x0000003991800000)
 libc.so.6 => /lib64/libc.so.6 (0x0000003985a00000)
 /lib64/ld-linux-x86-64.so.2 (0x0000003984a00000)

The grep I built:

  Version: 2.5.1
  Size: 256437 bytes
  ldd output:
 libpcre.so.0 => /lib64/libpcre.so.0 (0x0000003991800000)
 libc.so.6 => /lib64/libc.so.6 (0x0000003985a00000)
 /lib64/ld-linux-x86-64.so.2 (0x0000003984a00000)

When running a regular-expression search on a large list text file, the system grep (330 msecs) is far faster than the grep I built (22430 msecs).

These are the commands I used:

% time src/grep ".*asa.*" large_list.txt > /dev/null
real 0m22.430s
user 0m22.291s
sys 0m0.080s

OR

% time bin/grep ".*asa.*" large_list.txt > /dev/null
real 0m0.331s
user 0m0.236s
sys 0m0.081s

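The system grep is obviously built with some optimizing options that make a huge difference in performance.

Can somebody help me work out which options the system grep could have been built with?

These are the compile options for one of the source files in my build: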
gcc -DLIBDIR=\"/usr/local/lib\" -DHAVE_CONFIG_H -I. -I.. -I.. -I. -I../intl -g -O2 -MT xstrtol.o -MD -MP -MF .deps/xstrtol.Tpo -c -o xstrtol.o xstrtol.c

Output of ./configure:

checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
checking for gawk... (cached) gawk
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables... 
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking for style of include used by make... GNU
checking dependency style of gcc... gcc3
checking for a BSD-compatible install... /usr/bin/install -c
checking for ranlib... ranlib
checking for getconf... getconf
checking for CFLAGS value to request large file support... 
checking for LDFLAGS value to request large file support... 
checking for LIBS value to request large file support... 
checking for _FILE_OFFSET_BITS... no
checking for _LARGEFILE_SOURCE... no
checking for _LARGE_FILES... no
checking for function prototypes... yes
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for string.h... (cached) yes
checking for size_t... yes
checking for ssize_t... yes
checking for an ANSI C-conforming const... yes
checking for inttypes.h... yes
checking for unsigned long long... yes
checking for ANSI C header files... (cached) yes
checking for string.h... (cached) yes
checking for stdlib.h... (cached) yes
checking sys/param.h usability... yes
checking sys/param.h presence... yes
checking for sys/param.h... yes
checking for memory.h... (cached) yes
checking for unistd.h... (cached) yes
checking libintl.h usability... yes
checking libintl.h presence... yes
checking for libintl.h... yes
checking wctype.h usability... yes
checking wctype.h presence... yes
checking for wctype.h... yes
checking wchar.h usability... yes
checking wchar.h presence... yes
checking for wchar.h... yes
checking for dirent.h that defines DIR... yes
checking for library containing opendir... none required
checking whether stat file-mode macros are broken... no
checking for working alloca.h... yes
checking for alloca... yes
checking whether closedir returns void... no
checking for stdlib.h... (cached) yes
checking for unistd.h... (cached) yes
checking for getpagesize... yes
checking for working mmap... yes
checking for btowc... yes
checking for isascii... yes
checking for iswctype... yes
checking for mbrlen... yes
checking for memmove... yes
checking for setmode... no
checking for strerror... yes
checking for wcrtomb... yes
checking for wcscoll... yes
checking for wctype... yes
checking whether mbrtowc and mbstate_t are properly declared... yes
checking for stdlib.h... (cached) yes
checking for mbstate_t... yes
checking for memchr... yes
checking for stpcpy... yes
checking for strtoul... yes
checking for atexit... yes
checking for fnmatch... yes
checking for stdlib.h... (cached) yes
checking whether <inttypes.h> defines strtoumax as a macro... no
checking for strtoumax... yes
checking whether strtoul is declared... yes
checking whether strtoull is declared... yes
checking for strerror in -lcposix... no
checking for inline... inline
checking for off_t... yes
checking whether we are using the GNU C Library 2.1 or newer... yes
checking argz.h usability... yes
checking argz.h presence... yes
checking for argz.h... yes
checking limits.h usability... yes
checking limits.h presence... yes
checking for limits.h... yes
checking locale.h usability... yes
checking locale.h presence... yes
checking for locale.h... yes
checking nl_types.h usability... yes
checking nl_types.h presence... yes
checking for nl_types.h... yes
checking malloc.h usability... yes
checking malloc.h presence... yes
checking for malloc.h... yes
checking stddef.h usability... yes
checking stddef.h presence... yes
checking for stddef.h... yes
checking for stdlib.h... (cached) yes
checking for string.h... (cached) yes
checking for unistd.h... (cached) yes
checking for sys/param.h... (cached) yes
checking for feof_unlocked... yes
checking for fgets_unlocked... yes
checking for getcwd... yes
checking for getegid... yes
checking for geteuid... yes
checking for getgid... yes
checking for getuid... yes
checking for mempcpy... yes
checking for munmap... yes
checking for putenv... yes
checking for setenv... yes
checking for setlocale... yes
checking for stpcpy... (cached) yes
checking for strchr... yes
checking for strcasecmp... yes
checking for strdup... yes
checking for strtoul... (cached) yes
checking for tsearch... yes
checking for __argz_count... yes
checking for __argz_stringify... yes
checking for __argz_next... yes
checking for iconv... yes
checking for iconv declaration... 
         extern size_t iconv (iconv_t cd, char * *inbuf, size_t *inbytesleft, char * *outbuf, size_t *outbytesleft);
checking for nl_langinfo and CODESET... yes
checking for LC_MESSAGES... yes
checking whether NLS is requested... yes
checking whether included gettext is requested... no
checking for libintl.h... (cached) yes
checking for GNU gettext in libc... yes
checking for dcgettext... yes
checking for msgfmt... /usr/bin/msgfmt
checking for gmsgfmt... /usr/bin/msgfmt
checking for xgettext... /usr/bin/xgettext
checking for bison... bison
checking version of bison... 2.3, ok
checking for catalogs to be installed...  af be bg ca cs da de el eo es et eu fi fr ga gl he hr hu id it ja ko ky lt nb nl pl pt pt_BR ro ru rw sk sl sr sv tr uk vi zh_TW
checking for dos file convention... no
checking host system type... (cached) x86_64-unknown-linux-gnu
checking host system type... (cached) x86_64-unknown-linux-gnu
checking for DJGPP environment... no
checking for environ variable separator... :
checking for working re_compile_pattern... yes
checking for getopt_long... yes
configure: WARNING: Included lib/regex.c not used
checking whether strerror_r is declared... yes
checking for strerror_r... yes
checking whether strerror_r returns char *... no
checking for strerror... (cached) yes
checking for strerror_r... (cached) yes
checking for vprintf... yes
checking for doprnt... no
checking for ANSI C header files... (cached) yes
checking for working malloc... yes
checking for working realloc... yes
checking for pcre_exec in -lpcre... yes
configure: creating ./config.status
config.status: creating Makefile
config.status: creating lib/Makefile
config.status: creating lib/posix/Makefile
config.status: creating src/Makefile
config.status: creating tests/Makefile
config.status: creating po/Makefile.in
config.status: creating intl/Makefile
config.status: WARNING:  intl/Makefile.in seems to ignore the --datarootdir setting
config.status: creating doc/Makefile
config.status: creating m4/Makefile
config.status: creating vms/Makefile
config.status: creating bootstrap/Makefile
config.status: creating config.h
config.status: config.h is unchanged
config.status: executing depfiles commands
config.status: executing default-1 commands
config.status: creating po/POTFILES
config.status: creating po/Makefile
config.status: executing stamp-h commands

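Thanks, Kumar

5 answers:

Answer 0: (score: 10)

Why don't you get the CentOS SRPM for the grep binary and compare its compile options with yours? I would guess that is a lot more efficient than having the whole StackOverflow community poke around blindly in the dark until they stumble on something.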
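One way to follow this suggestion on an RPM-based system is sketched below. It assumes `rpm` is installed and that `yumdownloader` (from the yum-utils package) is available for fetching the source package. The spec path `/usr/src/redhat/SPECS/grep.spec` is an assumption about where the source RPM unpacks on CentOS 5, and `list_patches` is a helper made up for this example.

```shell
#!/bin/sh
# list_patches is a hypothetical helper: it prints the Patch lines of a spec
# file, i.e. the distribution's changes on top of the upstream tarball.
list_patches() {
  grep -i '^Patch[0-9]*:' "$1"
}

if command -v rpm >/dev/null 2>&1; then
  # Name of the source package the installed grep was built from.
  rpm -qf --queryformat '%{SOURCERPM}\n' /bin/grep
  # Default compiler flags that the distribution's RPM macros pass to builds.
  rpm --eval '%{optflags}'
fi

# Assumed location of the unpacked spec file; adjust for your system.
spec=${SPEC:-/usr/src/redhat/SPECS/grep.spec}
if [ -f "$spec" ]; then
  list_patches "$spec"
else
  echo "no spec file at $spec; fetch the source package first with:"
  echo "  yumdownloader --source grep && rpm -ivh grep-*.src.rpm"
fi
```

The patch list is usually the more interesting part of the comparison: a build from the upstream tarball uses none of those patches, whatever compiler flags it is given.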

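Edit: Are you using a locale with a multibyte encoding? (Note: if you don't know what that means, the answer is probably "yes", because UTF-8 has become the default on most Linux distributions, and RedHat (and therefore CentOS) was in fact one of the first to switch.)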
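A quick way to answer that question from a shell is sketched below. It assumes the `locale` utility is installed, as it is on glibc-based systems; the variable names are only illustrative.

```shell
#!/bin/sh
# Show the locale variables that grep will inherit from this shell.
locale 2>/dev/null

# Character set of the current locale. UTF-8 is a multibyte encoding;
# on glibc the POSIX locale typically reports ANSI_X3.4-1968 (plain ASCII).
charmap=$(locale charmap 2>/dev/null || echo unknown)
echo "current charmap: $charmap"

# The same query with the locale forced to POSIX, for comparison.
posix_charmap=$(LC_ALL=POSIX locale charmap 2>/dev/null || echo unknown)
echo "POSIX charmap:   $posix_charmap"
```

If the first value is UTF-8 and the second is not, the two grep binaries are being compared in a multibyte locale.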

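In that case, GNU grep is dog slow. This applies not only to GNU grep but to pretty much every GNU tool that does some kind of text processing. The FSF refuses to accept any patches that improve multibyte performance unless those patches are proven not to slow down fixed-width encodings. However, since any patch that improves performance for multibyte encodings has to contain at least a couple of if statements somewhere, it is practically impossible to write one that does not slow down fixed-width encodings by at least the overhead of those if statements. So the UTF-8 performance of the GNU tools will continue to suck until the end of time.

Anyway, most Linux distributors don't give a rat's behind what the FSF thinks and patch GNU grep regardless. The Fedora Rawhide SRPM contains a patch called grep-2.5.3-egf-speedup.patch, which improves GNU grep's UTF-8 performance by several orders of magnitude. (Since that patch dates from 2005, I presume it is used in CentOS as well.) The same patch is used in Mac OS X, Debian, Ubuntu, ...; pretty much nobody uses GNU grep as distributed by GNU. Text processing in multibyte encodings will never be as fast as in fixed-width encodings, but it should at least be comparable, not 50x slower (or even 1500x slower, as some people have reported).

There is also another patch, called dfa-optional, which makes grep use only GNU libc's regex engine instead of its own. That engine is not only much faster when dealing with UTF-8, it also has fewer bugs.

So you might want to rerun your benchmark with export LC_ALL=POSIX set. If that fixes your problem, you need to apply either one of the two patches mentioned above.

More information is also available in these two RedHat bug reports:

Moral of the story: contrary to popular belief, Linux distributors do know what they are doing, at least sometimes. Don't second-guess them.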
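The rerun could look roughly like the sketch below. It generates its own sample file so that it is self-contained; substitute the real large_list.txt and point the hypothetical `GREP` variable at src/grep or /bin/grep. The locale name en_US.UTF-8 is an assumption; if it is not installed, that run silently falls back to the C locale and the two timings will look alike.

```shell
#!/bin/sh
# Hypothetical sample data: half of the lines contain "asa".
sample=$(mktemp)
awk 'BEGIN {
  for (i = 0; i < 20000; i++) {
    print "entry " i " casablanca"
    print "entry " i " other"
  }
}' > "$sample"

# Binary under test; defaults to whatever grep is first on PATH.
grep_bin=${GREP:-grep}

# Time the same search under a single-byte and a multibyte locale.
for loc in POSIX en_US.UTF-8; do
  echo "== LC_ALL=$loc =="
  time env LC_ALL="$loc" "$grep_bin" ".*asa.*" "$sample" > /dev/null
done

# Sanity check that the pattern matches what we expect.
matches=$(LC_ALL=POSIX "$grep_bin" -c ".*asa.*" "$sample")
echo "matching lines: $matches"
rm -f "$sample"
```

A large gap between the two timings for the self-built binary, and little or none for the system binary, would point at the multibyte code path rather than at compiler flags.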

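Answer 1: (score: 4)

You compiled with the -O2 flag. Why don't you use the -O3 flag? See the GCC documentation for a description of the optimization options that gcc provides.

Using Intel's ICC compiler might also help performance, but that really depends on the application. Also, it isn't free.

Edit: I just noticed the -g flag on the compile line. Remove it, since it turns on debugging support, which could cause a very serious performance hit.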

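Answer 2: (score: 1)

Another thought: besides the -O options, it looks like you are building with debug symbols ("-g").

Debug information generally increases the size of the binary and may reduce its performance. I would think the grep image is pretty stable and that you don't need the debug symbols.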
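It may be worth checking how much -g actually changes. With GCC, -g adds debug sections to the object file but is not intended to alter the generated machine code, so it is more likely to explain the size gap (256437 vs 88896 bytes) than the speed gap. The sketch below assumes gcc (or the compiler named in `CC`) is on the PATH; demo.c is a made-up stand-in for a grep source file.

```shell
#!/bin/sh
work=$(mktemp -d)
cat > "$work/demo.c" <<'EOF'
/* Hypothetical stand-in for a grep source file. */
int count_a(const char *s)
{
    int n = 0;
    for (; *s; s++)
        if (*s == 'a')
            n++;
    return n;
}
EOF

cc=${CC:-gcc}
if command -v "$cc" >/dev/null 2>&1; then
  # Same optimization level, with and without debug information.
  "$cc" -O2 -c -o "$work/plain.o" "$work/demo.c"
  "$cc" -g -O2 -c -o "$work/debug.o" "$work/demo.c"
  plain_bytes=$(( $(wc -c < "$work/plain.o") ))
  debug_bytes=$(( $(wc -c < "$work/debug.o") ))
  echo "without -g: $plain_bytes bytes, with -g: $debug_bytes bytes"
  # Compare the code and data sections only, if binutils' size is available.
  if command -v size >/dev/null 2>&1; then
    size "$work/plain.o" "$work/debug.o"
  fi
else
  echo "compiler $cc not found"
fi
rm -rf "$work"
```

To rebuild grep itself without debug information, autoconf-generated configure scripts accept the flags on the command line, for example `./configure CFLAGS="-O2"`, and running `strip` on the finished binary removes the symbols after the fact.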

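Answer 3: (score: 1)

Which version of GCC are you using? IIRC, GCC 4 was re-architected, which invalidated some of the optimization code for a while.

Answer 4: (score: 0)

With such a huge performance difference, it is probably a difference in the algorithm or the code, not just a difference in compiler optimization level. What makes you suspect the compiler?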