WebHDFS / HttpFS - uploading a jar file does not work correctly

Asked: 2016-06-16 11:46:30

Tags: hadoop curl hdfs webhdfs httpfs

I am trying to upload a Java jar file to an HDFS cluster running WebHDFS behind an HttpFS gateway.

I tried this curl command:

$ curl -v -X PUT --data-binary @myfile.jar "http://myhost:14000/webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser" -H "Content-Type: application/octet-stream"

It seems to work:

*   Trying myhost...
* Connected to myhost (myhost) port 14000 (#0)
> PUT /webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser HTTP/1.1
> Host: myhost:14000
> User-Agent: curl/7.43.0
> Accept: */*
> Content-Type: application/octet-stream
> Content-Length: 2566043
> Expect: 100-continue
> 
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 307 Temporary Redirect
< X-Powered-By: Express
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: HEAD, POST, GET, OPTIONS, DELETE
< Access-Control-Allow-Headers: origin, content-type, X-Auth-Token, Tenant-ID, Authorization
< server: Apache-Coyote/1.1
< set-cookie: hadoop.auth="u=myuser&p=myuser&t=simple&e=1466112799770&s=nf0V1RauYozVoVVvR+PxHZnGJ1E="; Version=1; Path=/; Expires=Thu, 16-Jun-2016 21:33:11 GMT; HttpOnly
< location: http://myhost:14000/webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser&data=true
< Content-Type: application/json; charset=utf-8
< content-length: 0
< date: Thu, 16 Jun 2016 11:33:11 GMT
< connection: close
< 
* Closing connection 0
* Issue another request to this URL: 'http://myhost:14000/webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser&data=true'
*   Trying myhost...
* Connected to myhost (myhost) port 14000 (#1)
> PUT /webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser&data=true HTTP/1.1
> Host: myhost:14000
> User-Agent: curl/7.43.0
> Accept: */*
> Content-Type: application/octet-stream
> Content-Length: 2566043
> Expect: 100-continue
> 
< HTTP/1.1 100 Continue
* We are completely uploaded and fine
< HTTP/1.1 201 Created
< X-Powered-By: Express
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Methods: HEAD, POST, GET, OPTIONS, DELETE
< Access-Control-Allow-Headers: origin, content-type, X-Auth-Token, Tenant-ID, Authorization
< server: Apache-Coyote/1.1
< set-cookie: hadoop.auth="u=myuser&p=myuser&t=simple&e=1466112820064&s=p0i2IQ4Nbn2zytazKB1hHe3Dv+4="; Version=1; Path=/; Expires=Thu, 16-Jun-2016 21:33:48 GMT; HttpOnly
< Content-Type: application/json; charset=utf-8
< content-length: 0
< date: Thu, 16 Jun 2016 11:33:48 GMT
< connection: close
< 
* Closing connection 1

But when I try to use the jar, I get an error:

$ sudo -u myuser hadoop fs -copyToLocal /user/myuser/myfile.jar /home/myuser
$ sudo -u myuser jar -tf /home/myuser/myfile.jar
java.util.zip.ZipException: error in opening zip file
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:215)
    at java.util.zip.ZipFile.<init>(ZipFile.java:145)
    at java.util.zip.ZipFile.<init>(ZipFile.java:116)
    at sun.tools.jar.Main.list(Main.java:1004)
    at sun.tools.jar.Main.run(Main.java:245)
    at sun.tools.jar.Main.main(Main.java:1177)

The thing is, the size of the jar file in HDFS is much larger than the original, so I suspect it was not uploaded correctly:

$ ls -la myfile.jar 
-rw-r--r--  1 myuser  myuser  2566043 14 jun 16:11 myfile.jar
$ sudo -u myuser hadoop fs -ls /user/myuser
Found 1 items
-rwxr-xr-x   3 myuser myuser    4620153 2016-06-16 13:10 /user/myuser/myfile.jar
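For reference, a quick way to confirm that the copy stored in HDFS is actually corrupted (using the same file names as in the listings above) would be to compare the round-tripped copy against the original and check its file type:

$ cmp myfile.jar /home/myuser/myfile.jar    # reports the first differing byte, if any
$ file /home/myuser/myfile.jar              # an intact jar should be reported as Zip/Java archive data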

In curl, using -T instead of --data-binary makes no difference. I wondered whether the problem was the Content-Type header, so I tried binary/octet-stream; however, HttpFS then returns HTTP Status 400 - Data upload requests must have content-type set to 'application/octet-stream'.
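For reference, the explicit two-step CREATE flow described in the WebHDFS/HttpFS REST documentation (same host, user and file names as above; first request without a body, second request sending the data to the Location returned by the 307 response) would look roughly like this:

$ curl -v -X PUT "http://myhost:14000/webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser"
# the 307 response carries a Location header ending in &data=true; send the file there:
$ curl -v -X PUT -T myfile.jar -H "Content-Type: application/octet-stream" "http://myhost:14000/webhdfs/v1/user/myuser/myfile.jar?op=CREATE&user.name=myuser&data=true"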

Any hints?

0 Answers:

No answers.