HDFS: How to delete files matching a pattern

Time: 2018-11-27 21:12:20

Tags: bash hdfs

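I have a list of files named mydata.YYYY-MM-DD.log, with the date as the suffix. For example: mydata.2018-11-26.log

How can I write an hdfs command that deletes all files whose date suffix is earlier than an arbitrary date, for example 2018-11-20?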

Thx,

1 answer:

Answer 0: (score: 0)

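Maybe something like this:

Find the files with a pattern. For each file found, extract the date, compare it with the given cutoff date, and take action if it is earlier.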

#! /bin/bash

SCRIPT=$(basename "$0")

if [ $# -lt 2 ] || [ "$1" == "" ] || [ "$1" == "--help" ] || [ "$1" == "-h" ] || [ "$1" == "/?" ]; then
    echo "$SCRIPT: Usage: [directory-path] [cutoff-date]"
    echo "$SCRIPT: Deletes files in the given path older than the specified date YYYY-MM-DD"
    exit 1
fi

# TODO - check $CUTOFF is a valid YYYY-MM-DD date
# TODO - check $DIR is valid
DIR="$1"
CUTOFF="$2"

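# Find files ending with YYYY-MM-DD.log
# Note: find searches the local filesystem, not HDFS
find "$DIR" -type f -iname '*[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].log' |
while IFS= read -r file
do
    # Skip the 4-character ".log" suffix and keep the 10-character date before it
    date_part=$(echo "$file" | rev | cut -c5-14 | rev)
    # YYYY-MM-DD strings sort chronologically, so a string comparison is enough
    if [ "$date_part" \< "$CUTOFF" ]; then
        echo "$SCRIPT: should remove \"$file\""
        # rm -f "$file"
    fi
done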
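
Since the question is about HDFS rather than the local filesystem, the same idea could be adapted roughly as below. This is only a sketch, not a tested solution: it assumes the hdfs command is on the PATH, that hdfs dfs -ls prints the file path in the eighth column, and that paths contain no whitespace. The function names (select_older_than, list_hdfs_files, remove_hdfs_files) are made up for illustration.

```shell
#!/bin/bash

# Reads paths on stdin (one per line) and prints those ending in
# YYYY-MM-DD.log whose date is strictly earlier than the cutoff in $1.
select_older_than() {
    awk -v cutoff="$1" '
        match($0, /[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\.log$/) {
            d = substr($0, RSTART, 10)   # the 10-character date
            if (d < cutoff) print        # string compare; ISO dates sort chronologically
        }'
}

# Lists plain files in an HDFS directory, one path per line.
# Assumption: the path is field 8 of the listing and has no spaces.
# Lines for directories ("d...") and the "Found N items" header are skipped.
list_hdfs_files() {
    hdfs dfs -ls "$1" | awk '$1 ~ /^-/ { print $8 }'
}

# Dry run by default: prints the command that would be run for each path.
# Uncomment the hdfs line once the printed list looks right.
remove_hdfs_files() {
    while IFS= read -r path; do
        printf 'would run: hdfs dfs -rm "%s"\n' "$path"
        # hdfs dfs -rm "$path"
    done
}

# Example (hypothetical directory):
#   list_hdfs_files /data/logs | select_older_than 2018-11-20 | remove_hdfs_files
```

Keeping the date filter separate from the hdfs calls means it can be checked against a plain list of names before anything is deleted.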