If you want to know how I solved it, go here.
I have an Oozie workflow that contains a shell action.
<action name="start_fair_usage">
<shell xmlns="uri:oozie:shell-action:0.1">
<job-tracker>${JOB_TRACKER}</job-tracker>
<name-node>${NAME_NODE}</name-node>
<exec>${start_fair_usage}</exec>
<argument>${today_without_dash}</argument>
<argument>${yesterday_with_dash}</argument>
<file>${start_fair_usage_path}#${start_fair_usage}</file>
<capture-output/>
</shell>
<ok to="END"/>
<error to="KILL"/>
</action>
This action launches a script, start_fair_usage.sh:
echo "today_without_dash="$today_without_dash
echo "yeasterday_with_dash="$yeasterday_with_dash
echo "-----------RUN copy mta-------------"
bash copy_file.sh mta $today_without_dash
echo "-----------RUN copy rcr-------------"
bash copy_file.sh rcr $today_without_dash
echo "-----------RUN copy sub-------------"
bash copy_file.sh sub $today_without_dash
That script in turn starts another script, copy_file.sh:
# directories where the sub mtr rcr files are kept
echo "directories"
dirs=(
/user/comverse/data/${2}_B
)
# clear the hdfs directory of old files and copy new files
echo "remove old files "${1}
hadoop fs -rm -skipTrash /apps/hive/warehouse/amd.db/fair_usage/fct_evkuzmin/file_${1}/*
for i in $(hadoop fs -ls "${dirs[@]}" | egrep ${1}.gz | awk -F " " '{print $8}')
do
hadoop fs -cp $i /apps/hive/warehouse/amd.db/fair_usage/fct_evkuzmin/file_${1}
echo "copy file - "${i}
done
echo "end copy "${1}" files"
How do I start the workflow so that the files get copied?
Answer 0 (score: 1)
I ran into the same problem. Here is the stack trace:
2017-07-03 18:07:24,208 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: reportBadBlock encountered RemoteException for block: BP-455427998-10.120.117.100-1466433731629:blk_1140369410_67364810
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category WRITE is not supported in state standby
at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1774)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1313)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.reportBadBlocks(FSNamesystem.java:6263)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.reportBadBlocks(NameNodeRpcServer.java:798)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.reportBadBlocks(DatanodeProtocolServerSideTranslatorPB.java:272)
at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28766)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy14.reportBadBlocks(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.reportBadBlocks(DatanodeProtocolClientSideTranslatorPB.java:290)
at org.apache.hadoop.hdfs.server.datanode.ReportBadBlockAction.reportTo(ReportBadBlockAction.java:62)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.processQueueMessages(BPServiceActor.java:988)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:727)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:824)
at java.lang.Thread.run(Thread.java:745)
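If you are familiar with Hadoop RPC, you will know that the error log above appears when the RPC client (the DataNode) makes a remote RPC call to the RPC server (the NameNode), and the NameNode throws an exception because it is in standby state. That is why part of the stack trace above is the server-side stack and part of it is the client-side stack.
But the key question is: does it have any harmful effect on your HDFS system?
The answer is absolutely not.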
From BPOfferService.java:
void notifyNamenodeReceivingBlock(ExtendedBlock block, String storageUuid) {
checkBlock(block);
ReceivedDeletedBlockInfo bInfo = new ReceivedDeletedBlockInfo(
block.getLocalBlock(), BlockStatus.RECEIVING_BLOCK, null);
for (BPServiceActor actor : bpServices) {
actor.notifyNamenodeBlock(bInfo, storageUuid, false);
}
}
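bpServices holds the RPC actors for both NameNodes, the active one and the standby one. The same request is evidently sent to both NameNodes, so at least one of them will report the 'Operation category WRITE is not supported in state standby' error while the other succeeds.
So there is nothing to worry about.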
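To make the fan-out concrete, here is a minimal, self-contained sketch of the idea. It does not use any Hadoop classes: FakeNameNode, HaState, StandbyRejection and reportToAll are hypothetical names made up for this illustration, and the real DataNode code is more involved. It only shows why one rejection per request is expected when the same call goes to both an active and a standby node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StandbyFanOutSketch {

    /** Hypothetical stand-in for a NameNode's HA state. */
    enum HaState { ACTIVE, STANDBY }

    /** Hypothetical stand-in for the exception a standby NameNode raises. */
    static class StandbyRejection extends Exception {
        StandbyRejection(String message) { super(message); }
    }

    /** Hypothetical stand-in for one NameNode RPC endpoint. */
    static class FakeNameNode {
        final String id;
        final HaState state;

        FakeNameNode(String id, HaState state) {
            this.id = id;
            this.state = state;
        }

        /** Refuses write-category operations while in standby, accepts them otherwise. */
        void reportBadBlock(String blockId) throws StandbyRejection {
            if (state == HaState.STANDBY) {
                throw new StandbyRejection(
                    "Operation category WRITE is not supported in state standby");
            }
            // An active node would record the report here; nothing to do in this sketch.
        }
    }

    /** Sends the same report to every NameNode and returns one result line per node. */
    static List<String> reportToAll(List<FakeNameNode> nameNodes, String blockId) {
        List<String> results = new ArrayList<>();
        for (FakeNameNode nn : nameNodes) {
            try {
                nn.reportBadBlock(blockId);
                results.add(nn.id + ": accepted");
            } catch (StandbyRejection e) {
                // The rejection is only noted; the loop still reaches the other node.
                results.add(nn.id + ": rejected (" + e.getMessage() + ")");
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<FakeNameNode> nameNodes = Arrays.asList(
            new FakeNameNode("namenode1", HaState.STANDBY),
            new FakeNameNode("namenode2", HaState.ACTIVE));
        for (String line : reportToAll(nameNodes, "blk_1140369410")) {
            System.out.println(line);
        }
    }
}
```

In this sketch the standby node's rejection does not stop the active node from accepting the report, which matches the behaviour described above: the message is noise, not data loss.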
Suppose your HDFS HA configuration looks like this:
<property>
<name>dfs.ha.namenodes.datahdfsmaster</name>
<value>namenode1,namenode2</value>
</property>
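Unfortunately, if namenode1 is the standby, you will get a lot of INFO-level log entries, because namenode1 is asked to perform certain operations and checkOperation() on the NameNode side will certainly throw an exception, which is logged at INFO level.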