I'm trying to write a Python script that copies a file to HDFS. I'm on Ubuntu and have Hadoop and Pydoop installed. The following is my script:
import pydoop.hdfs as hdfs
class COPYTOHDFS():
    local_path = '/home/user/test.txt'
    hdfs_path = '/testfile'
    host = 'master'
    port = 9000
    hdfsobj = hdfs.hdfs(host, port, user='cloudera-user', groups=['supergroup'])
    hdfsobj.copy(local_path, hdfsobj, hdfs_path)
Here is the error:
Traceback (most recent call last):
  File "COPYTOHDFS.py", line 3, in <module>
    class COPYTOHDFS():
  File "COPYTOHDFS.py", line 10, in COPYTOHDFS
    hdfsobj.copy(local_path, hdfsobj, hdfs_path)
  File "/usr/local/lib/python2.7/dist-packages/pydoop-0.5.2_rc2-py2.7-linux-x86_64.egg/pydoop/hdfs.py", line 458, in copy
    return super(hdfs, self).copy(from_path, to_hdfs, to_path)
IOError: Cannot copy /home/user/test.txt to filesystem on master
The error message isn't very detailed. Any ideas?
Answer 0 (score: 2):
In conf/core-site.xml you can set the temp directory used for FS operations. If you forget to set ownership and permissions on those directories for the user the script runs as, that produces exactly this kind of IOError, so check that first.
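For illustration, the property usually involved is hadoop.tmp.dir; the path below is a made-up example value, so substitute your own:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
</property>

Then make sure that directory is owned by the user running the script, e.g. sudo chown -R cloudera-user:supergroup /app/hadoop/tmp.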
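As a side note: judging from the signature visible in the traceback, copy(from_path, to_hdfs, to_path) resolves from_path on the filesystem the method is called on, so the call in the question asks the HDFS on master to find /home/user/test.txt, not the local disk. Below is a minimal sketch of the usual local-to-HDFS pattern, under the assumption that pydoop follows the libhdfs convention where an empty host string connects to the local filesystem:

import pydoop.hdfs as hdfs

local_path = '/home/user/test.txt'
hdfs_path = '/testfile'

# Empty host string: connect to the local filesystem
# (assumption based on the libhdfs convention pydoop wraps).
localfs = hdfs.hdfs('', 0)
# Regular connection to the HDFS instance on master.
hdfsobj = hdfs.hdfs('master', 9000, user='cloudera-user', groups=['supergroup'])

# from_path is looked up on the filesystem copy() is called on,
# so call it on the local handle to push a local file to HDFS.
localfs.copy(local_path, hdfsobj, hdfs_path)

localfs.close()
hdfsobj.close()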