I am new to ZooKeeper locks and am trying to understand why some legacy code is failing.
A job involving Hadoop MapReduce is launched through zk-flock:
0 1 * * * <cron.user> /usr/bin/zk-flock ...
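However, according to the job log, it did not start at all (although cron did try to run it).
So I looked at the zk-flock log from the time the job was supposed to start: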
2020-08-07 01:00:01 INFO 84190 Logger has been initialized successfully
2020-08-07 01:00:01 INFO 84188 Logger has been initialized successfully
2020-08-07 01:00:01 INFO 84190 Connected to Zookeeper successfully
2020-08-07 01:00:01 INFO 84190 Auth using digest
2020-08-07 01:00:01 INFO 84188 Connected to Zookeeper successfully
2020-08-07 01:00:01 INFO 84188 Auth using digest
2020-08-07 01:00:01 INFO 84191 Logger has been initialized successfully
2020-08-07 01:00:01 INFO 84186 Logger has been initialized successfully
2020-08-07 01:00:01 INFO 84192 Logger has been initialized successfully
2020-08-07 01:00:01 INFO 84193 Logger has been initialized successfully
2020-08-07 01:00:01 INFO 84186 Connected to Zookeeper successfully
2020-08-07 01:00:01 INFO 84186 Auth using digest
2020-08-07 01:00:01 INFO 84192 Connected to Zookeeper successfully
2020-08-07 01:00:01 INFO 84191 Connected to Zookeeper successfully
2020-08-07 01:00:01 INFO 84192 Auth using digest
2020-08-07 01:00:01 INFO 84191 Auth using digest
2020-08-07 01:00:01 INFO 84193 Connected to Zookeeper successfully
2020-08-07 01:00:01 INFO 84193 Auth using digest
2020-08-07 01:00:03 INFO 84190 on_auth: state 0, result 0
2020-08-07 01:00:03 INFO 84190 Lock: fail
2020-08-07 01:00:03 INFO 84188 on_auth: state 0, result 0
2020-08-07 01:00:03 INFO 84188 Lock: fail
2020-08-07 01:00:03 INFO 84191 on_auth: state 0, result 0
2020-08-07 01:00:03 INFO 84186 on_auth: state 0, result 0
2020-08-07 01:00:03 INFO 84192 on_auth: state 0, result 0
2020-08-07 01:00:03 INFO 84193 on_auth: state 0, result 0
2020-08-07 01:00:03 INFO 84186 Lock: fail
2020-08-07 01:00:03 INFO 84192 Lock: fail
2020-08-07 01:00:03 INFO 84193 Lock: fail
2020-08-07 01:00:03 INFO 84191 Lock: fail
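My hypothesis is that the lock had not been released when the job tried to start, but I could not find anything on Google explaining what these log messages mean. I also do not have access to the MapReduce logs at the moment.
If I look at the zk-flock log from the previous day (at the time the job started):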
2020-08-06 01:00:01 INFO 16591 Logger has been initialized successfully
2020-08-06 01:00:01 INFO 16587 Logger has been initialized successfully
2020-08-06 01:00:01 INFO 16588 Logger has been initialized successfully
2020-08-06 01:00:01 INFO 16586 Logger has been initialized successfully
2020-08-06 01:00:01 INFO 16589 Logger has been initialized successfully
2020-08-06 01:00:01 INFO 16588 Connected to Zookeeper successfully
2020-08-06 01:00:01 INFO 16588 Auth using digest
2020-08-06 01:00:01 INFO 16589 Connected to Zookeeper successfully
2020-08-06 01:00:01 INFO 16589 Auth using digest
2020-08-06 01:00:01 INFO 16586 Connected to Zookeeper successfully
2020-08-06 01:00:01 INFO 16586 Auth using digest
2020-08-06 01:00:01 INFO 16590 Logger has been initialized successfully
2020-08-06 01:00:01 INFO 16587 Connected to Zookeeper successfully
2020-08-06 01:00:01 INFO 16591 Connected to Zookeeper successfully
2020-08-06 01:00:01 INFO 16587 Auth using digest
2020-08-06 01:00:01 INFO 16591 Auth using digest
2020-08-06 01:00:01 INFO 16590 Connected to Zookeeper successfully
2020-08-06 01:00:01 INFO 16590 Auth using digest
2020-08-06 01:00:03 INFO 16588 on_auth: state 0, result 0
2020-08-06 01:00:03 INFO 16587 on_auth: state 0, result 0
2020-08-06 01:00:03 INFO 16586 on_auth: state 0, result 0
2020-08-06 01:00:03 INFO 16591 on_auth: state 0, result 0
2020-08-06 01:00:03 INFO 16589 on_auth: state 0, result 0
2020-08-06 01:00:03 INFO 16588 Lock: success
2020-08-06 01:00:03 INFO 16589 Lock: success
2020-08-06 01:00:03 INFO 16590 on_auth: state 0, result 0
2020-08-06 01:00:03 INFO 16586 Lock: success
2020-08-06 01:00:03 INFO 16591 Lock: success
2020-08-06 01:00:03 INFO 16587 Lock: success
2020-08-06 01:00:03 INFO 16590 Lock: success
2020-08-06 01:00:03 INFO 16590 Start subprocess: /var/tass/tokenizer/release-current/cluster.tass.cit.sh (PID: 16656)
Here I can see that it did start.
So does it look like the job's lock had not been released when the next run tried to start?
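To compare the two runs side by side, a small script along these lines could tally the lock outcome per PID for each day. This is only a sketch: it assumes every log line has the `date time LEVEL pid message` layout seen in the excerpts above, and the `summarize` function and `LINE_RE` pattern are names made up for illustration, not part of zk-flock.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the excerpts above:
# "2020-08-07 01:00:03 INFO 84190 Lock: fail"
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<pid>\d+) (?P<msg>.*)$"
)


def summarize(lines):
    """Group zk-flock PIDs by date and by what they logged.

    Returns {date: {"success": [...], "fail": [...], "started": [...]}},
    where each list holds the PIDs that logged that event on that date.
    """
    summary = defaultdict(lambda: {"success": [], "fail": [], "started": []})
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a zk-flock log line
        day = summary[match["date"]]
        pid = int(match["pid"])
        msg = match["msg"]
        if msg == "Lock: success":
            day["success"].append(pid)
        elif msg == "Lock: fail":
            day["fail"].append(pid)
        elif msg.startswith("Start subprocess:"):
            day["started"].append(pid)
    return dict(summary)


if __name__ == "__main__":
    # A few lines copied from the logs in the question.
    sample = [
        "2020-08-07 01:00:03 INFO 84190 Lock: fail",
        "2020-08-07 01:00:03 INFO 84188 Lock: fail",
        "2020-08-06 01:00:03 INFO 16588 Lock: success",
        "2020-08-06 01:00:03 INFO 16590 Lock: success",
        "2020-08-06 01:00:03 INFO 16590 Start subprocess: "
        "/var/tass/tokenizer/release-current/cluster.tass.cit.sh (PID: 16656)",
    ]
    for date, result in sorted(summarize(sample).items()):
        print(date, result)
```

Running it over the full log file (for example with `summarize(open(path))`, where `path` is wherever zk-flock writes its log) would show, for each date, which PIDs got the lock, which did not, and which one actually launched the subprocess.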