I upgraded from CDH 4.4.0-1.cdh4.4.0.p0.39 to CDH 4.5.0-1.cdh4.5.0.p0.30. On the previous version I was running a distcp job from HDFS to S3:
hadoop distcp -i -update -p hdfs://NN:8020/path/to/directory/folder/ s3n://accesskeyid:keypass@mybucket/directory/;
Since the upgrade I am getting some strange warnings and errors:
WARN httpclient.HttpMethodReleaseInputStream: Attempting to release HttpMethod in finalize() as its response data stream has gone out of scope. This attempt will not always succeed and cannot be relied upon! Please ensure response data streams are always fully consumed or closed to avoid HTTP connection starvation.
13/12/10 11:26:27 WARN httpclient.HttpMethodReleaseInputStream: Successfully released HttpMethod in finalize(). You were lucky this time... Please ensure response data streams are always fully consumed or closed.
Error:
Task attempt_201312042223_7900_m_000003_2 failed to report status for 600 seconds. Killing!
attempt_201312042223_7900_m_000003_2: 2013-12-10 10:52:32
attempt_201312042223_7900_m_000003_2: Full thread dump Java HotSpot(TM) 64-Bit Server VM (20.6-b01 mixed mode):
attempt_201312042223_7900_m_000003_2: "org.apache.hadoop.hdfs.PeerCache@4ed9f47" daemon prio=10 tid=0x00007f96ccc8c000 nid=0x2801 waiting on condition [0x00007f96c2f0e000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: TIMED_WAITING (sleeping)
attempt_201312042223_7900_m_000003_2: at java.lang.Thread.sleep(Native Method)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.hdfs.PeerCache.run(PeerCache.java:252)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.hdfs.PeerCache.access$000(PeerCache.java:39)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.hdfs.PeerCache$1.run(PeerCache.java:135)
attempt_201312042223_7900_m_000003_2: at java.lang.Thread.run(Thread.java:662)
attempt_201312042223_7900_m_000003_2: "communication thread" daemon prio=10 tid=0x00007f96ccc87800 nid=0x27f4 in Object.wait() [0x00007f96c327f000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: TIMED_WAITING (on object monitor)
attempt_201312042223_7900_m_000003_2: at java.lang.Object.wait(Native Method)
attempt_201312042223_7900_m_000003_2: - waiting on <0x00000000fcb34ee0> (a java.lang.Object)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:662)
attempt_201312042223_7900_m_000003_2: - locked <0x00000000fcb34ee0> (a java.lang.Object)
attempt_201312042223_7900_m_000003_2: at java.lang.Thread.run(Thread.java:662)
attempt_201312042223_7900_m_000003_2: "Timer thread for monitoring jvm" daemon prio=10 tid=0x00007f96ccc57800 nid=0x27ed in Object.wait() [0x00007f96c3110000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: TIMED_WAITING (on object monitor)
attempt_201312042223_7900_m_000003_2: at java.lang.Object.wait(Native Method)
attempt_201312042223_7900_m_000003_2: - waiting on <0x00000000fcad8078> (a java.util.TaskQueue)
attempt_201312042223_7900_m_000003_2: at java.util.TimerThread.mainLoop(Timer.java:509)
attempt_201312042223_7900_m_000003_2: - locked <0x00000000fcad8078> (a java.util.TaskQueue)
attempt_201312042223_7900_m_000003_2: at java.util.TimerThread.run(Timer.java:462)
attempt_201312042223_7900_m_000003_2: "IPC Parameter Sending Thread #0" daemon prio=10 tid=0x00007f96cca80800 nid=0x27e4 waiting on condition [0x00007f96c33b5000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: TIMED_WAITING (parking)
attempt_201312042223_7900_m_000003_2: at sun.misc.Unsafe.park(Native Method)
attempt_201312042223_7900_m_000003_2: - parking to wait for <0x00000000fcae0080> (a java.util.concurrent.SynchronousQueue$TransferStack)
attempt_201312042223_7900_m_000003_2: at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:196)
attempt_201312042223_7900_m_000003_2: at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:424)
attempt_201312042223_7900_m_000003_2: at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:323)
attempt_201312042223_7900_m_000003_2: at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:874)
attempt_201312042223_7900_m_000003_2: at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
attempt_201312042223_7900_m_000003_2: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
attempt_201312042223_7900_m_000003_2: at java.lang.Thread.run(Thread.java:662)
attempt_201312042223_7900_m_000003_2: "IPC Client (1065524847) connection to /127.0.0.1:38248 from job_201312042223_7900" daemon prio=10 tid=0x00007f96cca98800 nid=0x27e3 in Object.wait() [0x00007f96c34b6000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: TIMED_WAITING (on object monitor)
attempt_201312042223_7900_m_000003_2: at java.lang.Object.wait(Native Method)
attempt_201312042223_7900_m_000003_2: - waiting on <0x00000000fcac0188> (a org.apache.hadoop.ipc.Client$Connection)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:803)
attempt_201312042223_7900_m_000003_2: - locked <0x00000000fcac0188> (a org.apache.hadoop.ipc.Client$Connection)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.ipc.Client$Connection.run(Client.java:846)
attempt_201312042223_7900_m_000003_2: "Thread for syncLogs" daemon prio=10 tid=0x00007f96cca5c800 nid=0x27e2 waiting on condition [0x00007f96c36bf000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: TIMED_WAITING (sleeping)
attempt_201312042223_7900_m_000003_2: at java.lang.Thread.sleep(Native Method)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.mapred.Child$3.run(Child.java:156)
attempt_201312042223_7900_m_000003_2: "Low Memory Detector" daemon prio=10 tid=0x00007f96cc0be000 nid=0x27dc runnable [0x0000000000000000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: RUNNABLE
attempt_201312042223_7900_m_000003_2: "C2 CompilerThread1" daemon prio=10 tid=0x00007f96cc0bb800 nid=0x27db waiting on condition [0x0000000000000000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: RUNNABLE
attempt_201312042223_7900_m_000003_2: "C2 CompilerThread0" daemon prio=10 tid=0x00007f96cc0b9000 nid=0x27da waiting on condition [0x0000000000000000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: RUNNABLE
attempt_201312042223_7900_m_000003_2: "Signal Dispatcher" daemon prio=10 tid=0x00007f96cc0b6800 nid=0x27d9 waiting on condition [0x0000000000000000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: RUNNABLE
attempt_201312042223_7900_m_000003_2: "Finalizer" daemon prio=10 tid=0x00007f96cc09a000 nid=0x27d8 in Object.wait() [0x00007f96c89f8000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: WAITING (on object monitor)
attempt_201312042223_7900_m_000003_2: at java.lang.Object.wait(Native Method)
attempt_201312042223_7900_m_000003_2: - waiting on <0x00000000fcac0480> (a java.lang.ref.ReferenceQueue$Lock)
attempt_201312042223_7900_m_000003_2: at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
attempt_201312042223_7900_m_000003_2: - locked <0x00000000fcac0480> (a java.lang.ref.ReferenceQueue$Lock)
attempt_201312042223_7900_m_000003_2: at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
attempt_201312042223_7900_m_000003_2: at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
attempt_201312042223_7900_m_000003_2: "Reference Handler" daemon prio=10 tid=0x00007f96cc098000 nid=0x27d7 in Object.wait() [0x00007f96c8af9000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: WAITING (on object monitor)
attempt_201312042223_7900_m_000003_2: at java.lang.Object.wait(Native Method)
attempt_201312042223_7900_m_000003_2: - waiting on <0x00000000fcad0070> (a java.lang.ref.Reference$Lock)
attempt_201312042223_7900_m_000003_2: at java.lang.Object.wait(Object.java:485)
attempt_201312042223_7900_m_000003_2: at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
attempt_201312042223_7900_m_000003_2: - locked <0x00000000fcad0070> (a java.lang.ref.Reference$Lock)
attempt_201312042223_7900_m_000003_2: "main" prio=10 tid=0x00007f96cc00e800 nid=0x27be waiting on condition [0x00007f96d3837000]
attempt_201312042223_7900_m_000003_2: java.lang.Thread.State: WAITING (parking)
attempt_201312042223_7900_m_000003_2: at sun.misc.Unsafe.park(Native Method)
attempt_201312042223_7900_m_000003_2: - parking to wait for <0x00000000f7669c00> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
attempt_201312042223_7900_m_000003_2: at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
attempt_201312042223_7900_m_000003_2: at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.apache.http.impl.conn.tsccm.WaitingThread.await(WaitingThread.java:158)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.apache.http.impl.conn.tsccm.ConnPoolByRoute.getEntryBlocking(ConnPoolByRoute.java:402)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.apache.http.impl.conn.tsccm.ConnPoolByRoute$1.getPoolEntry(ConnPoolByRoute.java:299)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.apache.http.impl.conn.tsccm.ThreadSafeClientConnManager$1.getConnection(ThreadSafeClientConnManager.java:242)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:456)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:334)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.impl.rest.httpclient.RestStorageService.performRequest(RestStorageService.java:281)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.impl.rest.httpclient.RestStorageService.performRestGet(RestStorageService.java:981)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2150)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.impl.rest.httpclient.RestStorageService.getObjectImpl(RestStorageService.java:2087)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.StorageService.getObject(StorageService.java:1140)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.S3Service.getObject(S3Service.java:2583)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.S3Service.getObject(S3Service.java:84)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.StorageService.getObject(StorageService.java:525)
attempt_201312042223_7900_m_000003_2: at com.cloudera.org.jets3t.service.S3Service.getObject(S3Service.java:1377)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:118)
attempt_201312042223_7900_m_000003_2: at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
attempt_201312042223_7900_m_000003_2: at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
attempt_201312042223_7900_m_000003_2: at java.lang.reflect.Method.invoke(Method.java:597)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.fs.s3native.$Proxy11.retrieveMetadata(Unknown Source)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:414)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1378)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.tools.DistCp$CopyFilesMapper.rename(DistCp.java:484)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.tools.DistCp$CopyFilesMapper.copy(DistCp.java:461)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:547)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.tools.DistCp$CopyFilesMapper.map(DistCp.java:314)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
attempt_201312042223_7900_m_000003_2: at java.security.AccessController.doPrivileged(Native Method)
attempt_201312042223_7900_m_000003_2: at javax.security.auth.Subject.doAs(Subject.java:396)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
attempt_201312042223_7900_m_000003_2: at org.apache.hadoop.mapred.Child.main(Child.java:262)
attempt_201312042223_7900_m_000003_2: "VM Thread" prio=10 tid=0x00007f96cc091800 nid=0x27d6 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#0 (ParallelGC)" prio=10 tid=0x00007f96cc021800 nid=0x27bf runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#1 (ParallelGC)" prio=10 tid=0x00007f96cc023800 nid=0x27c0 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#2 (ParallelGC)" prio=10 tid=0x00007f96cc025800 nid=0x27c1 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#3 (ParallelGC)" prio=10 tid=0x00007f96cc027000 nid=0x27c2 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#4 (ParallelGC)" prio=10 tid=0x00007f96cc029000 nid=0x27c3 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#5 (ParallelGC)" prio=10 tid=0x00007f96cc02b000 nid=0x27c4 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#6 (ParallelGC)" prio=10 tid=0x00007f96cc02c800 nid=0x27c5 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#7 (ParallelGC)" prio=10 tid=0x00007f96cc02e800 nid=0x27c6 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#8 (ParallelGC)" prio=10 tid=0x00007f96cc030800 nid=0x27c7 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#9 (ParallelGC)" prio=10 tid=0x00007f96cc032000 nid=0x27c8 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#10 (ParallelGC)" prio=10 tid=0x00007f96cc034000 nid=0x27c9 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#11 (ParallelGC)" prio=10 tid=0x00007f96cc036000 nid=0x27ca runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#12 (ParallelGC)" prio=10 tid=0x00007f96cc037800 nid=0x27cb runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#13 (ParallelGC)" prio=10 tid=0x00007f96cc039800 nid=0x27cc runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#14 (ParallelGC)" prio=10 tid=0x00007f96cc03b800 nid=0x27cd runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#15 (ParallelGC)" prio=10 tid=0x00007f96cc03d000 nid=0x27ce runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#16 (ParallelGC)" prio=10 tid=0x00007f96cc03f000 nid=0x27cf runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#17 (ParallelGC)" prio=10 tid=0x00007f96cc041000 nid=0x27d0 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#18 (ParallelGC)" prio=10 tid=0x00007f96cc042800 nid=0x27d1 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#19 (ParallelGC)" prio=10 tid=0x00007f96cc044800 nid=0x27d2 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#20 (ParallelGC)" prio=10 tid=0x00007f96cc046800 nid=0x27d3 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#21 (ParallelGC)" prio=10 tid=0x00007f96cc048000 nid=0x27d4 runnable
attempt_201312042223_7900_m_000003_2: "GC task thread#22 (ParallelGC)" prio=10 tid=0x00007f96cc04a000 nid=0x27d5 runnable
attempt_201312042223_7900_m_000003_2: "VM Periodic Task Thread" prio=10 tid=0x00007f96cc0d1000 nid=0x27dd waiting on condition
attempt_201312042223_7900_m_000003_2: JNI global references: 1678
attempt_201312042223_7900_m_000003_2: Heap
attempt_201312042223_7900_m_000003_2: PSYoungGen total 191168K, used 134794K [0x00000000f2ab0000, 0x0000000100000000, 0x0000000100000000)
attempt_201312042223_7900_m_000003_2: eden space 163904K, 78% used [0x00000000f2ab0000,0x00000000fa91fef0,0x00000000fcac0000)
attempt_201312042223_7900_m_000003_2: from space 27264K, 19% used [0x00000000fcac0000,0x00000000fcff2af8,0x00000000fe560000)
attempt_201312042223_7900_m_000003_2: to space 27264K, 0% used [0x00000000fe560000,0x00000000fe560000,0x0000000100000000)
attempt_201312042223_7900_m_000003_2: PSOldGen total 436928K, used 0K [0x00000000d8000000, 0x00000000f2ab0000, 0x00000000f2ab0000)
attempt_201312042223_7900_m_000003_2: object space 436928K, 0% used [0x00000000d8000000,0x00000000d8000000,0x00000000f2ab0000)
attempt_201312042223_7900_m_000003_2: PSPermGen total 28160K, used 28023K [0x00000000d2e00000, 0x00000000d4980000, 0x00000000d8000000)
attempt_201312042223_7900_m_000003_2: object space 28160K, 99% used [0x00000000d2e00000,0x00000000d495dd20,0x00000000d4980000)
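I retried the same data set from another cluster that is still on the previous version and had no problems there. What changed so that distcp no longer closes its HTTP connections properly when using jets3t?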
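For reference, here is how I understand the failure mode the warning describes. In the dump above the "main" thread is parked in ConnPoolByRoute.getEntryBlocking, which looks consistent with a connection pool that has no free entries left. The sketch below is only a simulation under my own assumptions: TinyPool, PoolStarvationSketch, and the pool size of 2 are hypothetical stand-ins built on a Semaphore, not jets3t or HttpClient code.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical simulation of a bounded connection pool; not jets3t/HttpClient code.
class PoolStarvationSketch {

    /** A pool with a fixed number of connection slots. */
    static final class TinyPool {
        private final Semaphore slots;

        TinyPool(int maxConnections) {
            slots = new Semaphore(maxConnections);
        }

        /** Returns a response stream, or null if no slot frees up within the timeout. */
        InputStream open(long timeoutMillis) {
            try {
                if (!slots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return null; // a real pool may block indefinitely instead of timing out
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
            // The slot is handed back to the pool only when the stream is closed.
            return new FilterInputStream(new ByteArrayInputStream(new byte[] {1, 2, 3})) {
                private boolean released;

                @Override
                public void close() throws IOException {
                    if (!released) {
                        released = true;
                        slots.release();
                    }
                    super.close();
                }
            };
        }
    }

    /** Issues up to `requests` calls against a pool of `poolSize`; returns how many got a connection. */
    static int run(int poolSize, int requests, boolean closeStreams) {
        TinyPool pool = new TinyPool(poolSize);
        int served = 0;
        for (int i = 0; i < requests; i++) {
            InputStream in = pool.open(50);
            if (in == null) {
                break; // pool exhausted: the caller would hang here in a blocking pool
            }
            served++;
            if (closeStreams) {
                try {
                    try {
                        while (in.read() != -1) {
                            // drain the response body
                        }
                    } finally {
                        in.close(); // returns the slot to the pool
                    }
                } catch (IOException e) {
                    throw new IllegalStateException(e);
                }
            }
            // Otherwise the stream goes out of scope unclosed and its slot is never returned.
        }
        return served;
    }

    public static void main(String[] args) {
        System.out.println("closing streams: served " + run(2, 5, true));
        System.out.println("leaking streams: served " + run(2, 5, false));
    }
}
```

With streams closed, all 5 requests are served from a pool of 2. With streams leaked, only the first 2 get a connection and the third one times out, which in a pool that blocks without a timeout would look like the hung task in my dump.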
Any help is greatly appreciated.