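I am currently trying to set up a distributed Tsung load-testing environment, which uses Erlang's slave functionality, and I am failing to get the controller node to start the slave nodes. For example: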
(musicglue@load1)1> net:ping(musicglue@load2).
pong
(musicglue@load1)2> slave:start(load2,musicglue,"-setcookie tom").
{error,timeout}
My environment:
Controller - hostname: load1, user: musicglue, Ubuntu 10.04 LTS, Erlang R15B01 compiled from source
Slave - hostname: load2, user: musicglue, Ubuntu 10.04 LTS, Erlang R15B01 compiled from source
Firewall disabled
SELinux not installed
Things that work:
Ping output:
musicglue@load1:~$ erl -rsh ssh -sname musicglue -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.9.1 (abort with ^G)
(musicglue@load1)1> net:ping(musicglue@load2).
pong
The problem occurs when I try to start a slave session on load2 from load1:
musicglue@load1:~$ erl -rsh ssh -sname musicglue -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.9.1 (abort with ^G)
(musicglue@load1)1> net:ping(musicglue@load2).
pong
(musicglue@load1)2> slave:start(load2,musicglue,"-setcookie tom").
{error,timeout}
This is the output I get from epmd when I run the slave:start command:
epmd: Thu May 24 10:01:57 2012: Non-local peer connected
epmd: Thu May 24 10:01:57 2012: opening connection on file descriptor 4
epmd: Thu May 24 10:01:57 2012: got 12 bytes
***** 00000000 00 0a 7a 6d 75 73 69 63 67 6c 75 65 |..zmusicglue|
epmd: Thu May 24 10:01:57 2012: ** got PORT2_REQ
epmd: Thu May 24 10:01:57 2012: got 2 bytes
***** 00000000 77 01 |w.|
epmd: Thu May 24 10:01:57 2012: ** sent PORT2_RESP (error) for "musicglue"
epmd: Thu May 24 10:01:57 2012: closing connection on file descriptor 4
epmd: Thu May 24 10:01:57 2012: Local peer connected
epmd: Thu May 24 10:01:57 2012: opening connection on file descriptor 4
epmd: Thu May 24 10:01:57 2012: got 24 bytes
***** 00000000 00 16 78 ca d6 4d 00 00 05 00 05 00 09 6d 75 73 |..x..M.......mus|
***** 00000010 69 63 67 6c 75 65 00 00 |icglue..|
epmd: Thu May 24 10:01:57 2012: ** got ALIVE2_REQ
epmd: Thu May 24 10:01:57 2012: registering 'musicglue:1', port 51926
epmd: Thu May 24 10:01:57 2012: type 77 proto 0 highvsn 5 lowvsn 5
epmd: Thu May 24 10:01:57 2012: got 4 bytes
***** 00000000 79 00 00 01 |y...|
epmd: Thu May 24 10:01:57 2012: ** sent ALIVE2_RESP for "musicglue"
epmd: Thu May 24 10:01:57 2012: unregistering 'musicglue:1', port 51926
epmd: Thu May 24 10:01:57 2012: closing connection on file descriptor 4
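Any help or advice would be greatly appreciated.
Many thanks.
I should also mention that I can see load2 successfully accept the SSH connection, which is then disconnected immediately: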
May 30 13:49:27 load2 sshd[16169]: Accepted publickey for musicglue from 173.45.236.182 port 51843 ssh2
May 30 13:49:27 load2 sshd[16171]: Received disconnect from 173.45.236.182: 11: disconnected by user
In response to the comments below, I have also tried starting the slave with a different node name for the slave:
musicglue@load1:~$ erl -rsh ssh -sname musicglue -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.9.1 (abort with ^G)
(musicglue@load1)1> slave:start(load2,bar,"-setcookie tom").
{error,timeout}
and for the controller:
musicglue@load1:~$ erl -rsh ssh -sname foo -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.9.1 (abort with ^G)
(foo@load1)1> slave:start(load2,musicglue,"-setcookie tom").
{error,timeout}
and for both:
musicglue@load1:~$ erl -rsh ssh -sname foo -setcookie tom
Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:4:4] [async-threads:0] [hipe] [kernel-poll:false]
Eshell V5.9.1 (abort with ^G)
(foo@load1)1> slave:start(load2,bar,"-setcookie tom").
{error,timeout}
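but to no avail.
It turned out that my problem was that the slave could not connect back to the controller over SSH, so it could not respond to any commands.
After I fixed this communication port between the two nodes, everything worked perfectly.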
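Since the root cause was the slave host being unable to reach the controller, it can help to test that direction on its own. The sketch below only builds the erl command for a short-lived probe node that pings the controller; the function name reverse_ping_cmd and the node name probe are hypothetical, and it assumes erl is on the PATH of non-interactive SSH sessions on the slave host.

```shell
#!/bin/sh
# Build the erl command that a throwaway probe node on the slave host
# would run to ping the controller node.
# $1 = controller node name (e.g. musicglue@load1), $2 = cookie.
# "probe" is a hypothetical short node name chosen for this sketch.
reverse_ping_cmd() {
    printf "erl -noshell -sname probe -setcookie %s -eval 'io:format(\"~p~n\", [net_adm:ping(%s)]), halt().'\n" "$2" "$1"
}

# Usage (needs SSH access to the slave host, so it is left commented out):
# ssh load2 "$(reverse_ping_cmd musicglue@load1 tom)"
```

If the probe prints pang rather than pong, the slave host cannot reach the controller's distribution ports, which matches the failure described above.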
Answer 0 (score: 2)
Try logging what happens over SSH by creating a shell script like this somewhere in your PATH:
#!/bin/sh
echo "$0" "$@" > /tmp/my-ssh.log
ssh -v "$@" 2>&1 | tee -a /tmp/my-ssh.log
Call it my-ssh, start Erlang with erl -rsh my-ssh, and then check what ends up in /tmp/my-ssh.log. That should shed some light on the problem...
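Answer 1 (score: 1)
An alternative answer for people who find this question through Google. If you are trying to start a service on a separate machine, the controller's node name must be resolvable from that machine.
For example, I was getting a timeout: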
> node().
someName@host.domain.com
> slave:start('192.168.122.196',bar,"-setcookie cookie").
{error,timeout}
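I fixed it by starting my Erlang instance with an explicit host part that the other machine can resolve (here an IP address):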
erl -name someName@192.168.1.5 -setcookie cookie
> slave:start('192.168.122.196',bar,"-setcookie cookie").
This command now succeeds.
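To check the resolvability requirement before calling slave:start, something like the following could be run on the remote machine. It is a minimal sketch: node_host and host_resolves are hypothetical helper names, and host_resolves assumes a glibc system where getent is available.

```shell
#!/bin/sh
# Print the host part of an Erlang node name (everything after the last "@").
node_host() {
    printf '%s\n' "${1##*@}"
}

# Succeed if the given host name resolves on this machine.
# Assumes glibc's getent; "ahosts" performs a getaddrinfo lookup.
host_resolves() {
    getent ahosts "$1" >/dev/null 2>&1
}

# Usage, run on the machine where the slave will be started:
# host_resolves "$(node_host someName@host.domain.com)" || echo "cannot resolve controller host"
```

If the check fails, either add the controller's host name to the remote machine's name resolution or start the controller with -name and an address the remote machine can reach, as in the example above.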