Impala - file not found error

时间:2015-01-01 07:02:08

标签: hadoop flume impala

I am using Impala with Flume as a file stream.

The problem is that Flume writes temporary files with a .tmp extension, and when they are deleted, Impala queries fail with the following message:

  

Backend 0: Failed to open HDFS file hdfs://localhost:8020/user/hive/../FlumeData.1420040201733.tmp Error(2): No such file or directory

How can I make Impala ignore these tmp files, or make Flume not write them, or write them to another directory?

Flume configuration:

### Agent2 - Avro Source and File Channel, hdfs Sink  ###
# Name the components on this agent
Agent2.sources = avro-source  
Agent2.channels = file-channel
Agent2.sinks = hdfs-sink

# Describe/configure Source
Agent2.sources.avro-source.type = avro
Agent2.sources.avro-source.hostname = 0.0.0.0
Agent2.sources.avro-source.port = 11111
Agent2.sources.avro-source.bind = 0.0.0.0

# Describe the sink
Agent2.sinks.hdfs-sink.type = hdfs
Agent2.sinks.hdfs-sink.hdfs.path = hdfs://localhost:8020/user/hive/table/
Agent2.sinks.hdfs-sink.hdfs.rollInterval = 0
Agent2.sinks.hdfs-sink.hdfs.rollCount = 10000
Agent2.sinks.hdfs-sink.hdfs.fileType = DataStream
#Use a channel which buffers events in file
Agent2.channels.file-channel.type = file
Agent2.channels.file-channel.checkpointDir = /home/ubuntu/flume/checkpoint/
Agent2.channels.file-channel.dataDirs = /home/ubuntu/flume/data/

# Bind the source and sink to the channel
Agent2.sources.avro-source.channels = file-channel
Agent2.sinks.hdfs-sink.channel = file-channel
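
As an aside to the question above: the Flume HDFS sink exposes hdfs.inUsePrefix and hdfs.inUseSuffix properties that control how in-progress files are named, and Impala (like Hive) skips files whose names start with a dot or underscore. Prefixing in-flight files with a dot should therefore hide them from queries until Flume rolls and renames them. A minimal sketch of that change, assuming the same Agent2 / hdfs-sink names as in the configuration above:

# Hide in-flight files from Impala/Hive by giving them a hidden-file prefix;
# files beginning with "." are ignored by queries until Flume renames them on roll
Agent2.sinks.hdfs-sink.hdfs.inUsePrefix = .
Agent2.sinks.hdfs-sink.hdfs.inUseSuffix = .tmp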

1 answer:

Answer 0 (score: 3)

I ran into this problem once.

I upgraded Hadoop and Flume and it was resolved (from Cloudera Hadoop CDH 5.2 to CDH 5.3).

Try upgrading Hadoop, Flume, or Impala.