I am working on a compute cluster and I am seeing very strange behavior from /usr/bin/env: in short, it runs extremely slowly. On the head node:
$ time /usr/bin/env which
<which output>
real 0m0.025s
user 0m0.001s
sys 0m0.001s
On a compute node:
$ qsub -I
qsub: waiting for job 176620.scyld.localdomain to start
qsub: job 176620.scyld.localdomain ready
-bash-3.2$ time which
<which output>
real 0m0.003s
user 0m0.000s
sys 0m0.003s
-bash-3.2$ time /usr/bin/env /usr/bin/which
<which output>
real 0m0.003s
user 0m0.000s
sys 0m0.003s
-bash-3.2$ time /usr/bin/env which
<which output>
real 5m0.003s
user 0m0.001s
sys 0m0.001s
ps ax reports this:
12884 pts/3 S+ 0:00 /usr/bin/env which
It takes 5 minutes to print the usage banner. Any idea why this happens?
Edit 1:
Additional information about which:
-bash-3.2$ type -a which
which is /usr/bin/which
-bash-3.2$ file /usr/bin/which
/usr/bin/which: ELF 64-bit LSB executable, AMD x86-64, version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs), stripped
-bash-3.2$ echo $PATH
/bin:/usr/bin:/home/gusev/.rvm/bin:/home/gusev/bin
Edit 2:
I ran strace on /usr/bin/env which, and it got stuck at:
execve("/bin/which", ["which"], [/* 47 vars */]
Simply running /bin/which now hangs as well, even though that file does not exist:
-bash-3.2$ ls /bin/which
ls: /bin/which: No such file or directory
/bin is mounted over NFS:
-bash-3.2$ mount | grep bin
10.54.0.1:/bin on /bin type nfs (nolock,nonfatal)
10.54.0.1:/usr/bin on /usr/bin type nfs (nolock,nonfatal)
So this may be a network issue...
Edit 3:
which which works perfectly:
-bash-3.2$ time which which
/usr/bin/which
real 0m0.002s
user 0m0.000s
sys 0m0.002s
The output of strace -e trace=execve /usr/bin/env which is:
execve("/usr/bin/env", ["/usr/bin/env", "which"], [/* 47 vars */]) = 0
execve("/bin/which", ["which"], [/* 47 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/bin/which", ["which"], [/* 47 vars */]) = 0
<which output>
Edit 4:
The hang always lasts exactly 5 minutes. It looks like some kind of default timeout.
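Answer 0 (score: 0)
It may be the which command, rather than the env command, that is causing the problem, since you see very different results for
time /usr/bin/env /usr/bin/which
vs.
time /usr/bin/env which
You may have another which command on your $PATH, perhaps in /usr/local/bin or $HOME/bin. What does type -a which tell you? What does your $PATH look like?
Note that which can be either a shell script or an executable. If it is a shell script, try grabbing a copy of it and adding set -x to see what it is doing.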
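Answer 1 (score: 0)
This problem, and the one described in your previous question, appears to be caused by execve taking a long time to return on the compute nodes. The fact that the dirs in your path are NFS-mounted is probably a contributing factor.
Running the command through strace, we see that env probes each directory in the path for the command with repeated calls to execve: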
[me@home]$ echo $PATH
/home/me/bin:/usr/lib/lightdm/lightdm:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/me/work/bin
[me@home]$ strace -e execve /usr/bin/env which
execve("/usr/bin/env", ["/usr/bin/env", "which"], [/* 53 vars */]) = 0
execve("/home/me/bin/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/lib/lightdm/lightdm/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/local/sbin/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/local/bin/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/sbin/which", ["which"], [/* 53 vars */]) = -1 ENOENT (No such file or directory)
execve("/usr/bin/which", ["which"], [/* 53 vars */]) = 0
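To find out which $PATH entry is the slow one, the probing shown above can be imitated by hand. The following is only a sketch under stated assumptions: probe_exec_times is a hypothetical helper name, it assumes the shell's exec attempts DIR/CMD directly (which should hold for bash and dash), and it measures whole seconds with date +%s, which is coarse but enough to spot a multi-minute hang. Note that when the command does exist in a directory it is actually run, with no arguments and with all I/O attached to /dev/null, so only try it with something harmless such as which.

```shell
# probe_exec_times CMD   (hypothetical helper, not part of any tool)
# For each directory in $PATH, attempt to exec DIR/CMD and print the
# elapsed whole seconds and the exit status of the attempt.
# An exit status of 127 normally means "not found".
# The body runs in a subshell so the IFS and set -f changes do not leak.
probe_exec_times() (
    cmd=$1
    set -f                      # no globbing while splitting $PATH
    IFS=:
    for dir in $PATH; do
        dir=${dir:-.}           # an empty entry means the current directory
        start=$(date +%s)
        # Inner subshell: exec either replaces it with DIR/CMD or fails.
        ( exec "$dir/$cmd" </dev/null >/dev/null 2>&1 )
        status=$?
        end=$(date +%s)
        printf '%s %ss exit=%s\n' "$dir/$cmd" "$((end - start))" "$status"
    done
)
```

Running probe_exec_times which on a compute node should then show whether one particular directory accounts for the delay.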
As you have confirmed in the comments above, which which does not suffer from the same problem, because it uses stat instead of execve to probe the paths:
[me@home]$ strace -e execve,stat /usr/bin/which which
execve("/usr/bin/which", ["/usr/bin/which", "which"], [/* 53 vars */]) = 0
stat("/home/me", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat(".", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
stat("/home/me/bin/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/lib/lightdm/lightdm/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/local/sbin/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/local/bin/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/sbin/which", 0x7fff79ae8760) = -1 ENOENT (No such file or directory)
stat("/usr/bin/which", {st_mode=S_IFREG|0755, st_size=946, ...}) = 0
/usr/bin/which
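The stat-style lookup seen in this trace can be approximated in plain shell with file tests instead of exec attempts. This is a minimal sketch, not how which itself is implemented; find_in_path is a hypothetical name introduced here for illustration.

```shell
# find_in_path CMD   (hypothetical helper)
# Print the first DIR/CMD in $PATH that is an executable regular file.
# Uses only file tests, so no exec is attempted while searching.
# Runs in a subshell so the IFS and set -f changes do not leak.
find_in_path() (
    cmd=$1
    set -f                      # no globbing while splitting $PATH
    IFS=:
    for dir in $PATH; do
        dir=${dir:-.}           # an empty entry means the current directory
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
            exit 0              # leaves the subshell; function returns 0
        fi
    done
    exit 1                      # not found anywhere in $PATH
)
```

Once the absolute path is known, the command can be invoked through it directly, for example "$(find_in_path which)" ls, so that nothing has to probe the earlier directories with execve.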
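I'm afraid I cannot suggest a fix for the underlying problem, but in the meantime you can work around it by not relying on env to resolve the paths for you. If you really do want to use env, reorder $PATH where possible to minimise the search. E.g.: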
PATH=/usr/bin:$PATH /usr/bin/env which # place most likely path first
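If the same reordering is needed in several places, it can be wrapped in a small function. This is a sketch only; path_with_first is a hypothetical name, and unlike the one-liner above it also removes the later duplicate of the directory so that PATH does not keep growing.

```shell
# path_with_first DIR   (hypothetical helper)
# Print the current $PATH with DIR moved (or added) to the front and
# any other occurrence of DIR removed. Does not modify $PATH itself.
path_with_first() (
    first=$1
    result=$first
    set -f                      # no globbing while splitting $PATH
    IFS=:
    for dir in $PATH; do
        [ "$dir" = "$first" ] && continue   # skip the duplicate
        result=$result:$dir
    done
    printf '%s\n' "$result"
)
```

A possible use: PATH=$(path_with_first /usr/bin) /usr/bin/env which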
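Answer 2 (score: 0)
In the end, I found that I had a very long PATH environment variable, and it probably somehow affected the execve calls against the NFS share.
So I moved a bunch of executables into a single directory and replaced many of the entries in PATH with {1}. I have not had any problems since.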
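A related, less invasive way to shorten PATH is to drop duplicate and nonexistent entries. This is not what was done above, just a sketch of the same idea of giving the lookup fewer directories to probe; clean_path is a hypothetical name, and the -d test itself touches each directory, so it could be slow on a misbehaving mount.

```shell
# clean_path   (hypothetical helper)
# Print $PATH with empty entries, nonexistent directories and repeated
# directories removed, keeping the original order of first occurrences.
clean_path() (
    result=
    set -f                      # no globbing while splitting $PATH
    IFS=:
    for dir in $PATH; do
        [ -n "$dir" ] || continue           # skip empty entries
        [ -d "$dir" ] || continue           # skip directories that do not exist
        case ":$result:" in
            *":$dir:"*) continue ;;         # already present, skip duplicate
        esac
        result=${result:+$result:}$dir
    done
    printf '%s\n' "$result"
)
```

It could be applied in a shell startup file with PATH=$(clean_path), after checking the printed value by hand first.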