Shell script that validates files in HDFS

Time: 2018-09-18 20:14:48

Tags: bash hadoop hdfs oozie oozie-workflow

I'm trying to write a script that checks whether any file is missing from an HDFS path. The idea is to include it in an Oozie workflow so that, when a file is not found, the action fails and the workflow does not continue.

ALTRAFI="/input/ALTrafi*.txt"
ALTDATOS="/ALTDatos*.txt"
ALTARVA="/ALTarva*.txt"
TRAFCIER="/TrafCier*.txt"

if hdfs dfs -test -e $ALTRAFI; then
   echo "[$ALTRAFI] Archive not found"
   exit 1
fi
if hdfs dfs -test -e $ALTDATOS; then
   echo "[$ALTDATOS] Archive not found"
   exit 2
fi
if hdfs dfs -test -e $ALTARVA; then
   echo "[$ALTARVA] Archive not found"
   exit 3
fi   
if hdfs dfs -test -e $TRAFCIER; then
   echo "[$TRAFCIER] Archive not found"
   exit 4
fi 

But the script does not fail when a file is missing, and the Oozie workflow continues with the next action.

Oozie workflow:
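
One thing that may explain this: `if cmd; then` runs its branch when `cmd` exits with status 0, and `hdfs dfs -test -e` exits with 0 when the path exists, so the checks above appear to fire when a file is found rather than when it is missing. Below is a minimal sketch with the condition negated. `hdfs_exists` and `check_all` are hypothetical helper names, not part of the original script, and glob matching by `-test -e` is assumed to behave the way the original script expects.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: succeeds (exit status 0) when the HDFS path exists.
# The pattern is quoted so the local shell does not expand the glob itself.
hdfs_exists() {
  hdfs dfs -test -e "$1"
}

# Hypothetical helper: checks each pattern in order and returns the
# 1-based position of the first missing one (0 if all are present),
# mirroring the exit codes 1-4 used in the original script.
check_all() {
  local position=1
  local pattern
  for pattern in "$@"; do
    # Note the "!": the branch runs only when the test FAILS.
    if ! hdfs_exists "$pattern"; then
      echo "[$pattern] Archive not found" >&2
      return "$position"
    fi
    position=$((position + 1))
  done
  return 0
}

# Intended use as the last lines of ValidateFiles.sh:
# check_all "/input/ALTrafi*.txt" "/ALTDatos*.txt" "/ALTarva*.txt" "/TrafCier*.txt"
# exit $?
```

With this shape the script ends with a non-zero exit status when a file is missing, which is what lets the shell action take the `<error to="Kill"/>` transition instead of `<ok to="CopyFiles"/>`.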

<start to="ValidateFiles"/>
<kill name="Kill">
    <message>Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<action name="ValidateFiles">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
              <name>mapred.job.queue.name</name>
              <value>${queueName}</value>
            </property>
            <property>
                <name>tez.lib.uris</name>
                <value>/hdp/apps/2.5.0.0-1245/tez/tez.tar.gz</value>
            </property>
        </configuration>
        <exec>/produccion/apps/traficotasado/ValidateFiles.sh</exec>
        <file>/produccion/apps/traficotasado/ValidateFiles.sh#ValidateFiles.sh</file> <!--Copy the executable to compute node's current working directory -->
    </shell>
    <ok to="CopyFiles"/>
    <error to="Kill"/>
</action>
<action name="CopyFiles">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
              <name>mapred.job.queue.name</name>
              <value>${queueName}</value>
            </property>
            <property>
                <name>tez.lib.uris</name>
                <value>/hdp/apps/2.5.0.0-1245/tez/tez.tar.gz</value>
            </property>
        </configuration>
        <exec>/produccion/apps/traficotasado/CopyFiles.sh</exec>
        <file>/produccion/apps/traficotasado/CopyFiles.sh#CopyFiles.sh</file> <!--Copy the executable to compute node's current working directory -->
    </shell>
    <ok to="DepuraFilesStage"/>
    <error to="Kill"/>
</action>

Thanks for the help

0 answers:

No answers