I am trying to run a MapReduce job in Eclipse.
I am connecting to a Hortonworks VM and reading a file from HDFS. Here is how the file appears in HDFS:
I access the file with the following code:
FileInputFormat.setInputPaths(conf, new Path("hdfs://127.0.0.1:8020/user/hue/smallClaimData.txt"));
I believe the path is correct: the first time I ran it, I got a "File does not exist" error; once I added the user folder name (which I had initially omitted), that error went away, so I think I am referencing the file in HDFS correctly. However, when I run the MapReduce job, I get the following error (warning: it is long and ugly, but I kept the logging verbose in the hope that it helps):
[main] WARN org.apache.hadoop.conf.Configuration - file:/tmp/hadoop-user/mapred/local/localRunner/user/job_local1865934580_0001/job_local1865934580_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
[main] WARN org.apache.hadoop.conf.Configuration - file:/tmp/hadoop-user/mapred/local/localRunner/user/job_local1865934580_0001/job_local1865934580_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
[main] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
[Thread-11] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
[Thread-11] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
[Thread-11] DEBUG org.apache.hadoop.mapred.LocalJobRunner - Starting thread pool executor.
[Thread-11] DEBUG org.apache.hadoop.mapred.LocalJobRunner - Max local threads: 1
[Thread-11] DEBUG org.apache.hadoop.mapred.LocalJobRunner - Map tasks to process: 1
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local1865934580_0001_m_000000_0
[Thread-11] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.mapred.SortedRanges - currentIndex 0 0:0
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.mapred.LocalJobRunner - mapreduce.cluster.local.dir for child : /tmp/hadoop-user/mapred/local/localRunner//user/jobcache/job_local1865934580_0001/attempt_local1865934580_0001_m_000000_0
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.mapred.Task - using new api for output committer
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.yarn.util.ProcfsBasedProcessTree - ProcfsBasedProcessTree currently is supported only on Linux.
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : null
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: hdfs://127.0.0.1:8020/user/hue/smallClaimData.txt:0+142
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - (EQUATOR) 0 kvi 26214396(104857584)
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - mapreduce.task.io.sort.mb: 100
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - soft limit at 83886080
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - bufstart = 0; bufvoid = 104857600
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - kvstart = 26214396; length = 6553600
[IPC Parameter Sending Thread #0] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user sending #2
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user got value #2
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: getBlockLocations took 6ms
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - newInfo = LocatedBlocks{
fileLength=142
underConstruction=false
blocks=[LocatedBlock{BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629; getBlockSize()=142; corrupt=false; offset=0; locs=[10.0.2.15:50010]}]
lastLocatedBlock=LocatedBlock{BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629; getBlockSize()=142; corrupt=false; offset=0; locs=[10.0.2.15:50010]}
isLastBlockComplete=true}
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 10.0.2.15:50010
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: closed
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: stopped, remaining connections 0
[LocalJobRunner Map Task Executor #0] WARN org.apache.hadoop.hdfs.DFSClient - Failed to connect to /10.0.2.15:50010 for block, add to deadNodes and continue. org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.2.15:50010]
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.2.15:50010]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
at org.apache.hadoop.hdfs.DFSInputStream.newTcpPeer(DFSInputStream.java:955)
at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1107)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:533)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:164)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.hdfs.DFSClient - Could not obtain BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
[LocalJobRunner Map Task Executor #0] WARN org.apache.hadoop.hdfs.DFSClient - DFS chooseDataNode: got # 1 IOException, will wait for 595.1956215159421 msec.
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.Client - The ping interval is 60000 ms.
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.Client - Connecting to /127.0.0.1:8020
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: starting, having connections 1
[IPC Parameter Sending Thread #1] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user sending #3
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user got value #3
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: getBlockLocations took 9ms
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - newInfo = LocatedBlocks{
fileLength=142
underConstruction=false
blocks=[LocatedBlock{BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629; getBlockSize()=142; corrupt=false; offset=0; locs=[10.0.2.15:50010]}]
lastLocatedBlock=LocatedBlock{BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629; getBlockSize()=142; corrupt=false; offset=0; locs=[10.0.2.15:50010]}
isLastBlockComplete=true}
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 10.0.2.15:50010
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: closed
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: stopped, remaining connections 0
[LocalJobRunner Map Task Executor #0] WARN org.apache.hadoop.hdfs.DFSClient - Failed to connect to /10.0.2.15:50010 for block, add to deadNodes and continue. org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.2.15:50010]
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.2.15:50010]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
at org.apache.hadoop.hdfs.DFSInputStream.newTcpPeer(DFSInputStream.java:955)
at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1107)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:533)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:164)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.hdfs.DFSClient - Could not obtain BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
[LocalJobRunner Map Task Executor #0] WARN org.apache.hadoop.hdfs.DFSClient - DFS chooseDataNode: got # 2 IOException, will wait for 3865.511256846443 msec.
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.Client - The ping interval is 60000 ms.
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.Client - Connecting to /127.0.0.1:8020
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: starting, having connections 1
[IPC Parameter Sending Thread #2] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user sending #4
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user got value #4
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: getBlockLocations took 9ms
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - newInfo = LocatedBlocks{
fileLength=142
underConstruction=false
blocks=[LocatedBlock{BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629; getBlockSize()=142; corrupt=false; offset=0; locs=[10.0.2.15:50010]}]
lastLocatedBlock=LocatedBlock{BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629; getBlockSize()=142; corrupt=false; offset=0; locs=[10.0.2.15:50010]}
isLastBlockComplete=true}
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 10.0.2.15:50010
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: closed
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: stopped, remaining connections 0
[LocalJobRunner Map Task Executor #0] WARN org.apache.hadoop.hdfs.DFSClient - Failed to connect to /10.0.2.15:50010 for block, add to deadNodes and continue. org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.2.15:50010]
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.2.15:50010]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
at org.apache.hadoop.hdfs.DFSInputStream.newTcpPeer(DFSInputStream.java:955)
at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1107)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:533)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:164)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.hdfs.DFSClient - Could not obtain BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
[LocalJobRunner Map Task Executor #0] WARN org.apache.hadoop.hdfs.DFSClient - DFS chooseDataNode: got # 3 IOException, will wait for 12531.690669475103 msec.
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.Client - The ping interval is 60000 ms.
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.Client - Connecting to /127.0.0.1:8020
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: starting, having connections 1
[IPC Parameter Sending Thread #3] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user sending #5
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user got value #5
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine - Call: getBlockLocations took 16ms
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - newInfo = LocatedBlocks{
fileLength=142
underConstruction=false
blocks=[LocatedBlock{BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629; getBlockSize()=142; corrupt=false; offset=0; locs=[10.0.2.15:50010]}]
lastLocatedBlock=LocatedBlock{BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629; getBlockSize()=142; corrupt=false; offset=0; locs=[10.0.2.15:50010]}
isLastBlockComplete=true}
[LocalJobRunner Map Task Executor #0] DEBUG org.apache.hadoop.hdfs.DFSClient - Connecting to datanode 10.0.2.15:50010
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: closed
[IPC Client (1508440322) connection to /127.0.0.1:8020 from user] DEBUG org.apache.hadoop.ipc.Client - IPC Client (1508440322) connection to /127.0.0.1:8020 from user: stopped, remaining connections 0
[LocalJobRunner Map Task Executor #0] WARN org.apache.hadoop.hdfs.DFSClient - Failed to connect to /10.0.2.15:50010 for block, add to deadNodes and continue. org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.2.15:50010]
org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending remote=/10.0.2.15:50010]
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:532)
at org.apache.hadoop.hdfs.DFSInputStream.newTcpPeer(DFSInputStream.java:955)
at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1107)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:533)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:164)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[LocalJobRunner Map Task Executor #0] WARN org.apache.hadoop.hdfs.DFSClient - DFS Read
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629 file=/user/hue/smallClaimData.txt
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:164)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Starting flush of map output
[Thread-11] INFO org.apache.hadoop.mapred.LocalJobRunner - Map task executor complete.
[Thread-11] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local1865934580_0001
java.lang.Exception: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629 file=/user/hue/smallClaimData.txt
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1200952396-10.0.2.15-1398089695400:blk_1073742320_1629 file=/user/hue/smallClaimData.txt
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:838)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:526)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
at java.io.DataInputStream.read(DataInputStream.java:100)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
at org.apache.hadoop.mapreduce.lib.input.LineRecordReader.nextKeyValue(LineRecordReader.java:164)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
[Thread-11] DEBUG org.apache.hadoop.security.UserGroupInformation - PrivilegedAction as:user (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:329)
[Thread-1] DEBUG org.apache.hadoop.ipc.Client - Stopping client
My assumption is that it times out because the Hortonworks VM is not letting it connect, possibly due to a permissions/user issue? I have been researching this for a while now without making much progress.
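Reading the log, the client reaches the NameNode on 127.0.0.1:8020 just fine (the getBlockLocations calls succeed), but every TCP connect to the DataNode at 10.0.2.15:50010 times out. A quick way to confirm it is a network-reachability problem rather than a permissions issue is a raw TCP connect with a timeout, which roughly mirrors what DFSClient attempts before reading a block. This is a hypothetical diagnostic sketch (the class name and timeouts are my own), not part of the original job:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class DataNodeReachability {

    /** Attempt a plain TCP connect, roughly what DFSClient does before reading a block. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers both "connection refused" and connect timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Addresses taken from the log above: the NameNode answers, the DataNode does not.
        System.out.println("namenode 127.0.0.1:8020  reachable = " + canConnect("127.0.0.1", 8020, 3000));
        System.out.println("datanode 10.0.2.15:50010 reachable = " + canConnect("10.0.2.15", 50010, 3000));
    }
}
```

If the NameNode connect succeeds while the DataNode connect times out, the problem is that 10.0.2.15 (the VM's NAT-side address, which the NameNode advertises in the block locations) is not routable from the host, which points at VM networking rather than HDFS permissions.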
Answer 0 (score: 2)
I ran into a similar problem: my code was trying to connect to 10.0.2.15, which was not exposed on my network. I am using VirtualBox 4.3.20 on OS X 10.9.5.

I solved it as follows:

- In the VirtualBox preferences, make sure the host-only network vboxnet0 exists (if it doesn't, create it; just click the screwdriver icon). In my case it has the address 192.168.56.1 with mask 255.255.255.0, and its DHCP server sits at 192.168.56.100 with bounds .101-.254.
- In the VM's network settings, attach a host-only adapter to vboxnet0.
- Just to verify everything is working: the VM should now get an address on vboxnet0 such as 192.168.56.101 instead of 10.0.2.15.

Now I can connect to the sandbox HDFS from my Mac.
Answer 1 (score: 0)
Something similar happened to me. Thanks to @Zdenek's answer, I fixed it by disabling all the other network adapters in the VM's settings and enabling only host-only networking.

In my machine's hosts file I added:

192.168.56.101 sandbox.hortonworks.com

and in the Java code I use 192.168.56.101 to connect.

There is no need to edit any /etc/hosts file inside the VM.
More details here.
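Before pointing the job at the sandbox by name, it can be worth sanity-checking that the hosts-file entry above actually takes effect for the JVM. This is a small illustrative sketch (the class name is my own; the hostname and address are the ones this answer assumes):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostsCheck {

    /** Resolve a hostname to an address string, or null if resolution fails. */
    static String resolve(String hostname) {
        try {
            return InetAddress.getByName(hostname).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // With the hosts entry in place, this should print 192.168.56.101.
        System.out.println("sandbox.hortonworks.com -> " + resolve("sandbox.hortonworks.com"));
    }
}
```

If this prints the host-only address, the input path can use the name instead of the raw IP, e.g. hdfs://sandbox.hortonworks.com:8020/user/hue/smallClaimData.txt.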