Copy and rename files based on the numbering of existing files

Time: 2019-02-26 14:01:23

Tags: bash copy rename

Good day,

I have a destination directory containing the following files:

V1__baseline.sql
V2__inserts.sql
V3__packages.sql
...
V10_change_table.sql

My source directory contains the following files:

v000_001_mk-tbl-dwa_ranking.sql
v000_002_mk-tbl-dwa_camp_week.sql
...
...
v000_179_crt_table_stat_flg.sql
v000-180_crt_table_ing_flg.sql
v000-181_crt_table_update_flg.sql

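What I want to do is copy every file after v000_179_crt_table_stat_flg.sql, whether it exists now or is added later, from the source to the destination, renaming each one so that it continues the numbering in the destination directory. The destination directory should end up looking like this: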

V1__baseline.sql
V2__inserts.sql
V3__packages.sql
...
V10__change_table.sql
V11__crt_table_ing_flg.sql
V12__crt_table_update_flg.sql

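In other words, destination file names have the format V{number}__{name}.sql, while source file names have the format v000-{number}_{name}.sql

How can I do this? I assume I need a clever loop script built around a command like this: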

cp "`ls -Art ${source_dir}/* | tail -n 1`"  ${destination_dir}/

2 Answers:

Answer 0: (score: 1)

A rough version:

targetDir=. # adjust as needed
declare -i ctr=1
declare -a found=()
for file in [Vv][0]*            # refine this to get the files you want
do [[ -e "$file" ]] || continue # skip the literal pattern when nothing matches
   x=${file#[Vv]}               # knock off the leading v or V
   while [[ "$x" =~ ^[0-9_-] ]] # while there are leading digits/dashes/underscores
   do x=${x:1}                  # strip them one character at a time
   done
   found=( "$targetDir"/V${ctr}__* )    # check for existing enumerator in the target
   while [[ -e "${found[0]}" ]]         # if found
   do (( ctr++ ))                       # increment
      found=( "$targetDir"/V${ctr}__* ) # and check again
   done
   mv "$file" "$targetDir/V${ctr}__$x"  # move the file (use cp to copy instead)
done

Please read it carefully, ask questions, and then edit it to fit your specific needs.
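The character-by-character stripping loop above could also be written as a single regular-expression match. The following is only a sketch, assuming file names shaped like the ones in the question; the function name strip_prefix is made up for illustration. Like the loop, it also strips any digits that start the descriptive part of the name.

```shell
#!/bin/bash
# Hypothetical helper: drop a leading v/V plus the run of digits, dashes
# and underscores that follows it, leaving the descriptive name.
strip_prefix() {
    local name=$1
    if [[ $name =~ ^[Vv][0-9_-]+(.*)$ ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        # No recognizable prefix: return the name unchanged
        printf '%s\n' "$name"
    fi
}

strip_prefix "v000-180_crt_table_ing_flg.sql"   # prints crt_table_ing_flg.sql
strip_prefix "v000_179_crt_table_stat_flg.sql"  # prints crt_table_stat_flg.sql
```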

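Answer 1: (score: 1)

Because the question asks for files to be renamed and copied rather than renamed and moved, the solution presumably has to make sure that files in the source directory are not duplicated in the destination. This complicates the solution.

The script cannot simply check whether a source file already exists in the destination, because files are renamed in the process. Running cmp or diff could waste resources, particularly if the files being compared are large database dumps (as the .sql extension suggests).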

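In the solution below I have added a manifest file to keep track of which files have been copied, but I would not be happy with this approach if I were building it for myself. If the manifest file is accidentally deleted or edited, the script loses track of which files have already been copied and will copy all of them again on the next run. The sequential numbering of file names in the destination directory would then be thrown off. If you can, I think it would be better to either:

  • rename the source files to reflect their copied status, and exclude those files from later copy operations
  • rename the files and move them from the source directory to the destination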
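The first alternative might look roughly like the sketch below. The ".copied" suffix, the function name copy_and_mark, and the temporary demo directories are assumptions made for illustration, not part of the original answer.

```shell
#!/bin/bash
# Sketch: mark a source file as copied by renaming it after a successful copy.
copy_and_mark() {
    local src_file=$1 dest_file=$2
    # Skip files that already carry the marker suffix
    [[ $src_file == *.copied ]] && return 0
    # Copy first; only mark the source when the copy succeeded
    cp -- "$src_file" "$dest_file" && mv -- "$src_file" "$src_file.copied"
}

# Demo in a throwaway directory
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dest"
touch "$demo/src/v000-180_crt_table_ing_flg.sql"

# The *.sql glob no longer matches files renamed to *.sql.copied,
# so a second run would find nothing left to copy.
for f in "$demo"/src/*.sql; do
    [[ -e $f ]] || continue
    copy_and_mark "$f" "$demo/dest/V11__crt_table_ing_flg.sql"
done
ls "$demo/src" "$demo/dest"
```

Because the copied state lives in the file name itself, there is no separate manifest to lose or corrupt.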

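Note that in numeric comparisons bash treats numbers with leading zeros as octal. You could strip the leading zeros when extracting the number to compare, but I have used $((10#$foo)) in the test conditions to force decimal interpretation. I think this confuses Stack Overflow's syntax highlighting, which wrongly treats the text after the # in 10# as a comment.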
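A small demonstration of the octal issue, using a zero-padded number like the ones in the source file names. The variable names here are illustrative only.

```shell
#!/bin/bash
num=0179

# Forcing base 10 makes bash ignore the leading zero
echo $(( 10#$num ))      # prints 179

# A leading zero otherwise selects octal: 010 is eight, not ten
echo $(( 010 ))          # prints 8
echo $(( 10#010 ))       # prints 10

# Alternative: strip the leading zeros before doing arithmetic
stripped=$(echo "$num" | sed 's/^0*//')
echo "${stripped:-0}"    # prints 179

# Without either step the expansion fails, because 9 is not an octal digit
( echo $(( num + 1 )) ) 2>/dev/null || echo "cannot evaluate $num as octal"
```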

#!/bin/bash

# Set source and destination paths
readonly SRC=src
readonly DEST=dest
readonly COPY_MANIFEST="${SRC}"/copied.txt

# $COPY_MANIFEST will keep track of which files have been copied
[[ -f "$COPY_MANIFEST" ]] || touch "$COPY_MANIFEST"

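# Get the highest index in destination directory from the file numeric prefix
highest=0
for file in "$DEST"/*; do
    [[ -e "$file" ]] || continue   # destination is empty
    base=$(basename "${file}")
    # Keep only the digits between the leading V and the first underscore,
    # so digits elsewhere in the name do not end up in the index
    index=$(echo "$base" | sed 's/^[Vv]\([0-9]*\)_.*/\1/')
    [[ $index =~ ^[0-9]+$ ]] || continue
    # Compare numbers. Convert to decimal format because leading zeros denote octal numbers
    [[ $((10#$highest)) -le $((10#$index)) ]] && highest=$index
done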

# Rename and copy files from source to destination
for original in ${SRC}/*; do
    previously_copied=false

    # Don't process the manifest file
    [[ "${original}" = "$COPY_MANIFEST" ]] && continue

    # If the source directory is empty, exit early
    [[ -f "$original" ]] || { echo "No source files in ${SRC}"; exit;}

    # Check the file has not already been copied - uses a manifest file rather 
    # than using tools like cmp or diff to check for duplicate files.
    while IFS= read -r line; do
        if [[ "${original}" = "${line}" ]]; then
            echo "${original} has already been renamed and copied."
            previously_copied=y
            break
        fi
    done < "$COPY_MANIFEST"
    [[ $previously_copied = y ]] && continue

    # Get the base name of the file
    name=$(basename "${original}")

    # Original question asks that all files greater than v000_179_crt_table_stat_flg.sql are copied.
    # If this requirement is not needed, the next 3 lines can be removed.
    # Source names start with a lowercase v and use either _ or - before the number.
    num=$(echo "$name" | sed 's/^[Vv][0-9]*[_-]\([0-9]*\).*/\1/')
    [[ $num =~ ^[0-9]+$ ]] || { echo "Not eligible"; continue; }
    [[ $((10#$num)) -le 179 ]] && { echo "Not eligible"; continue; }

    # Build the new filename and copy
    # Get rid of the prefix, leaving the descriptive name
    name=${name#[Vv][0-9]*[_-][0-9]*_}
    highest=$(( 10#$highest + 1 ))
    new_name=V${highest}__${name}
    cp "${original}" "${DEST}/${new_name}"

    # Update the manifest to prevent repeat copying
    echo "${original}" >> "$COPY_MANIFEST"
done
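
The manifest lookup in the inner loop could also be handed to grep. This is a sketch rather than part of the script above; the function name already_copied and the temporary manifest are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the exact path is already listed in the manifest.
already_copied() {
    local path=$1 manifest=$2
    # -F: fixed string, -x: match the whole line, -q: exit status only
    grep -Fxq -- "$path" "$manifest"
}

# Demo with a throwaway manifest
manifest=$(mktemp)
printf '%s\n' "src/v000-180_crt_table_ing_flg.sql" > "$manifest"

if already_copied "src/v000-180_crt_table_ing_flg.sql" "$manifest"; then
    echo "already copied"
fi
already_copied "src/v000-181_crt_table_update_flg.sql" "$manifest" || echo "not yet copied"
```

Matching whole lines as fixed strings avoids false positives when one path is a prefix of another or contains characters that are special in regular expressions.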