I am trying to write to HDFS from Python. Right now I am using https://hdfscli.readthedocs.io/en/latest/quickstart.html but for large files I get back:
  File "/home/edge7/venv-dev/local/lib/python2.7/site-packages/hdfs/client.py", line 400, in write
    consumer(data)
  File "/home/edge7/venv-dev/local/lib/python2.7/site-packages/hdfs/client.py", line 394, in consumer
    auth=False,
  File "/home/edge7/venv-dev/local/lib/python2.7/site-packages/hdfs/client.py", line 179, in _request
    **kwargs
  File "/home/edge7/venv-dev/local/lib/python2.7/site-packages/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/edge7/venv-dev/local/lib/python2.7/site-packages/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/home/edge7/venv-dev/local/lib/python2.7/site-packages/requests/adapters.py", line 415, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', gaierror(-2, 'Name or service not known'))
My code for writing is very simple:
from hdfs import InsecureClient

client = InsecureClient('http://xxxxxxx.co:50070', user='hdfs')
client.write("/tmp/a", stringToWrite)
Can anyone suggest a decent package for writing to HDFS? Cheers
Answer 0 (score: 0)
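Judging by the stack trace, it looks related to security. Are you sure you need to use InsecureClient rather than Kerberos? Also, keep in mind that the library is just a binding for HttpFS, so testing manually with Postman or cURL lets you debug any problems on the cluster side.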
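For the manual test, here is a rough sketch of the same check in Python 3, using only the standard library. It relies on two things: a WebHDFS CREATE is a two-step operation in which the namenode answers the first PUT with a redirect to a datanode, and gaierror(-2, 'Name or service not known') is a hostname-resolution failure. One possibility worth ruling out is that the redirect points at a datanode hostname the client machine cannot resolve. The function names (create_url, redirect_target, resolves, diagnose) are made up for illustration and are not part of the hdfs package; the sketch also assumes plain HTTP with no Kerberos.

```python
import http.client
import socket
from urllib.parse import quote, urlencode, urlsplit


def create_url(namenode, hdfs_path, user):
    """Build the WebHDFS CREATE URL from a namenode base URL (e.g. http://host:50070)."""
    query = urlencode({"op": "CREATE", "user.name": user, "overwrite": "true"})
    return "{}/webhdfs/v1{}?{}".format(namenode.rstrip("/"), quote(hdfs_path), query)


def redirect_target(url, timeout=10):
    """Send the first CREATE request without a body and return its Location header.

    http.client does not follow redirects, so the datanode URL chosen by the
    namenode can be inspected before any data is sent.
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=timeout)
    try:
        conn.request("PUT", "{}?{}".format(parts.path, parts.query))
        response = conn.getresponse()
        response.read()  # drain the (normally empty) body
        return response.getheader("Location")
    finally:
        conn.close()


def resolves(hostname):
    """Return True if this machine can resolve hostname (DNS or /etc/hosts)."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def diagnose(namenode, hdfs_path, user):
    """Print where the namenode redirects the write and whether that host resolves."""
    location = redirect_target(create_url(namenode, hdfs_path, user))
    print("redirect:", location)
    if location:
        host = urlsplit(location).hostname
        print(host, "resolves" if resolves(host) else "does NOT resolve")
```

Calling diagnose('http://xxxxxxx.co:50070', '/tmp/a', 'hdfs') from the machine that runs the failing script would show which datanode address the client is being sent to. If that hostname does not resolve there, the fix is on the network or cluster side (DNS or /etc/hosts entries for the datanodes) rather than in the Python code.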