Question

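I want to use the Linux terminal to delete all files in a folder whose names contain a numeric string that is not unique within that folder. E.g.: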

werrt-110009.jpg => delete
asfff-110009.JPG => delete
asffa-123489.jpg => maintain
asffa-111122.JPG => maintain

Any suggestions?

Answer 1

You can use the Linux find command with its -regex and -delete options to do the deletion in a single command.
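As a rough sketch of what such a find call could look like: find on its own does not work out which numbers are duplicated, so this assumes you already know the duplicated number. The names demo_dir and dup_number are made up for the illustration, and the sample files are created in a temporary directory so nothing real is touched. -regex and -delete are available in GNU find.

```shell
#!/bin/bash
# Sketch only: delete files whose number (just before the extension) is a
# known duplicate. demo_dir and dup_number are hypothetical names.
demo_dir="$(mktemp -d)"
touch "$demo_dir/werrt-110009.jpg" "$demo_dir/asfff-110009.JPG" \
      "$demo_dir/asffa-123489.jpg" "$demo_dir/asffa-111122.JPG"

dup_number="110009"

# -regex is matched against the whole path, so the pattern starts with .*
# [^0-9] before the number avoids matching it inside a longer number.
# -maxdepth 1 keeps find from descending into subdirectories.
find "$demo_dir" -maxdepth 1 -type f \
     -regex ".*[^0-9]${dup_number}\.[^./]*" -delete

# show what is left
ls "$demo_dir"
```

Running the same command without -delete first prints the files that would be removed, which is a cheap way to check the pattern.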

Answer 2

Use the "rm" command to delete all files in the directory whose names match the string:

cd <path-to-directory>/ && rm *110009*

This command deletes every file whose name contains the string, regardless of where the string appears in the file name.
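Because the glob matches anywhere in the name, it can be worth looking at the matches before removing them. A minimal sketch, assuming bash; demo_dir and matches are names I made up, and the sample files live in a temporary directory:

```shell
#!/bin/bash
# Sketch only: preview the glob matches, then delete them.
# demo_dir and matches are hypothetical names for this illustration.
demo_dir="$(mktemp -d)"
touch "$demo_dir/werrt-110009.jpg" "$demo_dir/asfff-110009.JPG" \
      "$demo_dir/asffa-123489.jpg" "$demo_dir/asffa-111122.JPG"
cd "$demo_dir" || exit 1

# nullglob makes an unmatched glob expand to nothing instead of itself
shopt -s nullglob
matches=(*110009*)

if [ "${#matches[@]}" -gt 0 ]; then
    printf 'deleting: %s\n' "${matches[@]}"
    # -- stops option parsing, so names starting with "-" are safe
    rm -- "${matches[@]}"
fi
```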

I mention the rm command as another option for deleting files whose names match a string.

Here is a complete script that does what you asked for:

#!/bin/bash
# bash is required: [[ =~ ]] and BASH_REMATCH do not exist in plain sh
set -eu

# destination folder path (first argument)
DEST_FOLDER_PATH="$1"

# an unmatched glob expands to nothing instead of staying a literal string
shopt -s nullglob

# the glob is expanded once, before the loop starts
for filepath in "$DEST_FOLDER_PATH"/*; do
    # skip files already deleted in an earlier iteration
    [ -e "$filepath" ] || continue
    filename="${filepath##*/}"

    # check for a number followed by a dot
    # (the regex must be unquoted in [[ =~ ]], otherwise bash matches it literally)
    num_regex='([0-9]+)\.'
    if [[ "$filename" =~ $num_regex ]]; then
        # extract the number to find other files containing it
        matching_string="${BASH_REMATCH[1]}"

        # collect all files whose name contains matching_string
        matches=("$DEST_FOLDER_PATH"/*"$matching_string"*)

        # more than one match means the number is not unique
        if [ "${#matches[@]}" -gt 1 ]; then
            rm -- "${matches[@]}"
        fi
    fi
done

exit 0

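How to run this script:

Save the script to a file, e.g. {path-to-script}/delete_duplicate_files.sh (you can name it anything you like).
Make the script executable:

chmod +x {path-to-script}/delete_duplicate_files.sh
Run the script, passing the path of the directory whose duplicate files (files sharing a numeric pattern) should be deleted:

{path-to-script}/delete_duplicate_files.sh "{path-to-directory}"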

Answer 3

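I think I only now understand your question. You want to delete all files (in a specific folder) whose names contain a non-unique numeric value. If a file name contains a value that is also found in another file name, you want to delete both files, right?

This is how I would do it (it may not be the fastest way):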

# put all files in your folder in a list
# with nullglob enabled, array=(*) is empty in an empty folder instead of holding a literal "*"
shopt -s nullglob
array=(*)
delete=()

for elem in "${array[@]}"; do
    # for each elem in your list extract the number
    num_regex='([0-9]+)\.'
    # skip names without a number; otherwise num would be empty
    [[ "$elem" =~ $num_regex ]] || continue
    num="${BASH_REMATCH[1]}"
    # use the extracted number to check if it is unique:
    # it must appear twice in the space-joined list, each time followed by a dot
    dup_regex="(^|[^0-9])$num\..*[^0-9]$num\."
    # if it is not unique, put the file in the files-to-delete list
    if [[ "${array[*]}" =~ $dup_regex ]]; then
        delete+=("$elem")
    fi
done

# delete all found duplicates
for elem in "${delete[@]}"; do
    rm -- "$elem"
done

In your example, array would be:

array=(werrt-110009.jpg asfff-110009.JPG asffa-123489.jpg asffa-111122.JPG)

and the resulting delete would be:

delete=(werrt-110009.jpg asfff-110009.JPG)

Is this what you meant?
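If matching every file against the whole list turns out to be slow, one alternative (not the approach above, just a sketch) is to count how often each number occurs and then delete the files whose number was seen more than once. This assumes bash 4 or later for associative arrays, and it assumes the number sits directly before the final extension, which is slightly stricter than the regex used above. demo_dir, count and num_regex are names chosen for the illustration, and the sample files are created in a temporary directory.

```shell
#!/bin/bash
# Sketch only: two passes over the file list, using a counter per number.
# Requires bash 4+ (declare -A). demo_dir is a hypothetical sample folder.
demo_dir="$(mktemp -d)"
touch "$demo_dir/werrt-110009.jpg" "$demo_dir/asfff-110009.JPG" \
      "$demo_dir/asffa-123489.jpg" "$demo_dir/asffa-111122.JPG"
cd "$demo_dir" || exit 1

shopt -s nullglob
# assumption: the number is the run of digits right before the last extension
num_regex='([0-9]+)\.[^.]*$'
declare -A count=()

# first pass: count how many file names carry each number
for f in *; do
    if [[ "$f" =~ $num_regex ]]; then
        n="${BASH_REMATCH[1]}"
        count[$n]=$(( ${count[$n]:-0} + 1 ))
    fi
done

# second pass: collect files whose number occurs more than once
delete=()
for f in *; do
    if [[ "$f" =~ $num_regex ]]; then
        n="${BASH_REMATCH[1]}"
        if [ "${count[$n]}" -gt 1 ]; then
            delete+=("$f")
        fi
    fi
done

# delete the collected files, if there are any
if [ "${#delete[@]}" -gt 0 ]; then
    rm -- "${delete[@]}"
fi
```

Each file name is matched twice here regardless of how many files there are, instead of being compared against the entire list.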

Find and delete files whose names contain the same string in the Linux terminal