Error when running an mrjob Python script on a Hadoop cluster

Time: 2019-08-30 23:58:15

Tags: python hadoop hdfs

Hi, I am trying to sort movie ratings with a Python script, but I am getting the following error:

`[root@sandbox-hdp maria_dev]# python RatingsBreakdown.py -r hadoop --hadoop-streaming-jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-streaming.jar u.data
No configs found; falling back on auto-configuration
No configs specified for hadoop runner
Looking for hadoop binary in $PATH...
Found hadoop binary: /usr/bin/hadoop
Using Hadoop version 3.1.1.3.0.1.0
Creating temp directory /tmp/RatingsBreakdown.maria_dev.20190830.233300.332634
STDERR: mkdir: Permission denied: user=root, access=WRITE, inode="/user/maria_dev":maria_dev:hdfs:drwxr-xr-x
Traceback (most recent call last):
File "RatingsBreakdown.py", line 19, in <module>
RatingsBreakdown.run()
File "/usr/lib/python2.7/site-packages/mrjob/job.py", line 446, in run
mr_job.execute()
File "/usr/lib/python2.7/site-packages/mrjob/job.py", line 473, in execute
super(MRJob, self).execute()
File "/usr/lib/python2.7/site-packages/mrjob/launch.py", line 202, in execute
self.run_job()
File "/usr/lib/python2.7/site-packages/mrjob/launch.py", line 247, in run_job
return self._handle(name, path, path)
File "/usr/lib/python2.7/site-packages/mrjob/fs/composite.py", line 118, in _handle
return getattr(fs, name)(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/mrjob/fs/hadoop.py", line 298, in mkdir
raise IOError("Could not mkdir %s" % path)
IOError: Could not mkdir hdfs:///user/maria_dev/tmp/mrjob/RatingsBreakdown.maria_dev.20190830.233300.332634/files/wd`

Can you tell me what the problem is here?

2 answers:

Answer 0: (score: 0)

Answer 1: (score: 0)

I found that Hortonworks takes a long time to start up. Once it had started properly, the job worked fine. Startup took about 1 hour.
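
Separately from the startup delay, the STDERR line in the question's log can be read field by field: HDFS refused a WRITE by user root on /user/maria_dev, which is owned by maria_dev:hdfs with mode drwxr-xr-x. Below is a minimal sketch for pulling those fields out of such a line. The function names `parse_permission_denied` and `owner_only_write` are made up for illustration, and the regular expression is only matched against the message format shown in this post, so treat it as an assumption rather than a general HDFS log parser.

```python
import re

# Pattern written against the single message shown in the question;
# other Hadoop versions may format this line differently (assumption).
PATTERN = re.compile(
    r'user=(?P<user>[^,]+), access=(?P<access>\w+), '
    r'inode="(?P<inode>[^"]+)"\s*'
    r':(?P<owner>[^:]+):(?P<group>[^:]+):(?P<mode>[d-][rwx-]{9})'
)


def parse_permission_denied(line):
    """Return the fields of an HDFS 'Permission denied' line, or None."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


def owner_only_write(info):
    """True if only the owner has the write bit in the mode string."""
    mode = info["mode"]
    # Indices 2, 5, 8 are the write bits for owner, group, and others.
    return mode[2] == "w" and mode[5] != "w" and mode[8] != "w"


line = ('STDERR: mkdir: Permission denied: user=root, access=WRITE, '
        'inode="/user/maria_dev":maria_dev:hdfs:drwxr-xr-x')
info = parse_permission_denied(line)
print(info)
print("only owner may write:", owner_only_write(info))
```

Here the requesting user (root) differs from the owner (maria_dev) and only the owner has the write bit. Running the job as maria_dev may therefore be worth trying, although nothing in this thread confirms that this was the cause.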