Setting up mrjob with Hadoop fails with error "returned non-zero exit status 256"

Time: 2015-07-23 03:18:25

Tags: python hadoop mrjob

I'm new to mrjob and Hadoop. After building my Hadoop cluster, I tried to submit a job to it with mrjob, but it failed with the error "returned non-zero exit status 256". Details follow:

1. Here is my example:

from mrjob.job import MRJob

import re

WORD_RE = re.compile(r"[\w']+")


class MRWordFreqCount(MRJob):

    def mapper(self, _, line):
        for word in WORD_RE.findall(line):
            yield (word.lower(), 1)

    def combiner(self, word, counts):
        yield (word, sum(counts))

    def reducer(self, word, counts):
        yield (word, sum(counts))


if __name__ == '__main__':
    MRWordFreqCount.run()

2. I ran this command:

python test.py -r hadoop  --python-bin=/root/.pyenv/versions/2.7.9/bin/python   ./pg20417.txt  

3. This is the output I got:

```
HADOOP: Job not successful!
HADOOP: Streaming Command Failed!
Job failed with return code 256: ['/diskb/dxb/code/hadoop-2.7.1/bin/hadoop', 'jar', '/diskb/dxb/code/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-streaming-2.7.1.jar', '-files', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/test.py#test.py,hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/setup-wrapper.sh#setup-wrapper.sh', '-archives', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/mrjob.tar.gz#mrjob.tar.gz', '-input', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/pg20417.txt', '-output', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/output', '-mapper', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --mapper', '-combiner', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --combiner', '-reducer', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --reducer']
Scanning logs for probable cause of failure
Traceback (most recent call last):
  File "test.py", line 25, in <module>
    MRWordFreqCount.run()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/job.py", line 461, in run
    mr_job.execute()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/job.py", line 479, in execute
    super(MRJob, self).execute()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/launch.py", line 151, in execute
    self.run_job()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/launch.py", line 214, in run_job
    runner.run()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/runner.py", line 464, in run
    self._run()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/hadoop.py", line 237, in _run
    self._run_job_in_hadoop()
  File "/root/.pyenv/versions/2.7.9/lib/python2.7/site-packages/mrjob/hadoop.py", line 372, in _run_job_in_hadoop
    raise CalledProcessError(returncode, step_args)
subprocess.CalledProcessError: Command '['/diskb/dxb/code/hadoop-2.7.1/bin/hadoop', 'jar', '/diskb/dxb/code/hadoop-2.7.1/share/hadoop/tools/lib/hadoop-streaming-2.7.1.jar', '-files', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/test.py#test.py,hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/setup-wrapper.sh#setup-wrapper.sh', '-archives', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/mrjob.tar.gz#mrjob.tar.gz', '-input', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/files/pg20417.txt', '-output', 'hdfs:///user/root/tmp/mrjob/test.root.20150723.011910.649661/output', '-mapper', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --mapper', '-combiner', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --combiner', '-reducer', 'sh -ex setup-wrapper.sh /root/.pyenv/versions/2.7.9/bin/python test.py --step-num=0 --reducer']' returned non-zero exit status 256
```
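
A note on the number 256 itself, offered as an assumption rather than a diagnosis: a POSIX exit code fits in the range 0-255, so 256 is probably not an exit code. It looks like a raw wait status of the kind `os.waitpid` returns, which on Linux would decode to exit code 1. That would only say the streaming command failed in some generic way; the actual cause would still have to come from the Hadoop task logs. A minimal sketch of the decoding, using only the standard `os` module (the helper name `describe_wait_status` is made up for illustration):

```python
import os


def describe_wait_status(status):
    """Decode a raw POSIX wait status into a short description.

    Assumption: `status` is the raw value returned by os.waitpid(),
    not an exit code that has already been decoded.
    """
    if os.WIFEXITED(status):
        # The process called exit(); the code sits in the high byte.
        return "exited with code %d" % os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        # The process was terminated by a signal.
        return "killed by signal %d" % os.WTERMSIG(status)
    return "unrecognized status %d" % status


if __name__ == "__main__":
    # 256 is the value reported in the traceback above.
    print(describe_wait_status(256))
```

If that reading is right, the message is equivalent to "hadoop jar ... exited with code 1", and the number carries no more detail than that.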

4. My environment:

hadoop2.7.1
python2.7.9

0 answers:

No answers yet