Question

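I have a folder structure containing thousands of folders. I want to find every folder that contains more than one .txt file, or more than one .jpeg, or whatever else, without seeing any folders that contain only a single file of that type.

Every folder is supposed to contain only one file of a given type, but that is not always the case, and hunting for the exceptions is tedious.

Note that the folders may also contain many other files.

If possible, I would like both "FILE.JPG" and "file.jpg" to count as matches, since both match a query on "file" or "jpg".

So far I have been running find . -iname "*file*" and going through the results by hand.

Folders contain folders, sometimes 3 or 4 levels deep.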

first/
  second/
     README.txt
     readme.TXT
     readme.txt
     foo.txt
   third/
     info.txt
   third/fourth/
     raksljdfa.txt

should return

first/second/README.txt
first/second/readme.TXT
first/second/readme.txt
first/second/foo.txt

when searching for "txt"

and

first/second/README.txt
first/second/readme.TXT
first/second/readme.txt

when searching for "readme"
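To try out the answers, the example tree can be recreated in a scratch directory. This is only a sketch: the function name make_example_tree is made up for illustration, and it assumes that third/ and third/fourth/ sit directly under first/, which the listing above does not make entirely clear. On a case-insensitive file system the three readme variants would collapse into a single file.

```shell
# Hypothetical helper: build the example tree under the directory given as $1.
# Assumption: third/ and third/fourth/ live directly under first/.
make_example_tree() {
    root=$1
    mkdir -p "$root/first/second" "$root/first/third/fourth" || return 1
    touch "$root/first/second/README.txt" \
          "$root/first/second/readme.TXT" \
          "$root/first/second/readme.txt" \
          "$root/first/second/foo.txt" \
          "$root/first/third/info.txt" \
          "$root/first/third/fourth/raksljdfa.txt"
}

tree_root=$(mktemp -d)
make_example_tree "$tree_root"
find "$tree_root" -type f | sort    # list what was created
```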

Answer 1

It sounds like you want something like this:

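find . -type f -print0 |
awk -v re='[.]txt$' '
BEGIN {
    RS = "\0"          # records are the NUL-terminated paths from find -print0
    IGNORECASE = 1     # GNU awk: make regex matching case-insensitive
}
{
    dir  = gensub("/[^/]+$","",1,$0)   # path without its last component
    file = gensub("^.*/","",1,$0)      # last component only
}
file ~ re {
    dir2files[dir][file]               # record the matching file under its directory
}
END {
    for (dir in dir2files) {
        if ( length(dir2files[dir]) > 1 ) {   # more than one match in this directory
            for (file in dir2files[dir]) {
                print dir "/" file
            }
        }
    }
}'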

Untested, but it should be close. It relies on GNU awk for gensub(), IGNORECASE, true multi-dimensional arrays, and length(array).
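Where GNU awk is not available, the same idea can be approximated with POSIX awk features only. This is a sketch rather than a drop-in replacement: the function name list_multi is hypothetical, the regex must be written in lowercase because the file name is lowercased before matching, and paths are read line by line, so file names containing newlines are not handled.

```shell
# list_multi DIR REGEX
# Print every file whose name matches REGEX (a lowercase ERE) and that shares
# its directory with at least one other matching file.
list_multi() {
    find "$1" -type f | awk -v re="$2" '
    {
        dir = $0;  sub("/[^/]+$", "", dir)    # path without its last component
        file = $0; sub("^.*/", "", file)      # last component only
        if (tolower(file) ~ re) {             # case-insensitive via tolower()
            count[dir]++
            paths[dir] = paths[dir] $0 "\n"   # collect full paths per directory
        }
    }
    END {
        for (dir in count)
            if (count[dir] > 1)
                printf "%s", paths[dir]
    }'
}
```

For example, list_multi . '[.]txt$' or list_multi . 'readme'. The order of directories in the output is unspecified, as with the GNU awk version.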

Answer 2

This pure Bash code should do it (with caveats, see below):

#! /bin/bash

fileglob=$1             # E.g. '*.txt' or '*readme*'

shopt -s nullglob       # Expand to nothing if nothing matches
shopt -s dotglob        # Match files whose names start with '.'
shopt -s globstar       # '**' matches multiple directory levels
shopt -s nocaseglob     # Ignore case when matching

IFS=                    # Disable word splitting

for dir in **/ ; do
    matching_files=( "$dir"$fileglob )
    (( ${#matching_files[*]} > 1 )) && printf '%s\n' "${matching_files[@]}"
done

Supply the pattern to match as an argument when you run the program. For example:

myprog '*.txt'
myprog '*readme*'

(The quotes around the patterns are necessary to stop the shell from matching them against files in the current directory.)
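The effect of the quotes can be seen with a small experiment. This is an illustrative sketch only; count_args is a made-up helper that reports how many arguments it received.

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt"
(
    cd "$demo" || exit 1
    count_args *.txt      # unquoted: the shell expands the glob first, prints 2
    count_args '*.txt'    # quoted: the pattern arrives intact, prints 1
)
rm -rf "$demo"
```

Without the quotes, the script would receive a.txt as $1 instead of the pattern itself.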

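The caveats about the code are:

globstar was introduced in Bash 4.0. The code will not work with older versions of Bash.
Prior to Bash 4.3, globstar matching follows symlinks. This can cause duplicated output, or even failures due to circular links.
The **/ pattern expands to a list of all the directories in the hierarchy. If there are very many directories (for example, more than ten thousand), this can take a long time or use too much memory.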
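The first two caveats depend only on the Bash version, so a script could check for them up front. The sketch below classifies a version string such as the value of $BASH_VERSION; the function name globstar_status and its output labels are invented for illustration.

```shell
# Hypothetical helper: classify a Bash version string against the caveats above.
# Prints one of: unknown, no-globstar, globstar-follows-symlinks, ok
globstar_status() {
    ver=$1
    case $ver in
        *.*) ;;                           # need at least "major.minor"
        *)   echo unknown; return ;;
    esac
    major=${ver%%.*}                      # text before the first dot
    rest=${ver#*.}                        # text after the first dot
    minor=${rest%%[!0-9]*}                # leading digits of the remainder
    case $major in ''|*[!0-9]*) echo unknown; return ;; esac
    : "${minor:=0}"
    if [ "$major" -lt 4 ]; then
        echo no-globstar
    elif [ "$major" -eq 4 ] && [ "$minor" -lt 3 ]; then
        echo globstar-follows-symlinks
    else
        echo ok
    fi
}

globstar_status "${BASH_VERSION:-}"       # prints "unknown" outside Bash
```

The third caveat (a very large number of directories) cannot be detected this way and has to be judged from the size of the tree.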

If your Bash is older than 4.3, or you have a large number of directories, this code is a better option:

#! /bin/bash

fileglob=$1             # E.g. '*.txt' or '*readme*'

shopt -s nullglob       # Expand to nothing if nothing matches
shopt -s dotglob        # Match files whose names start with '.'
shopt -s nocaseglob     # Ignore case when matching

IFS=                    # Disable word splitting

find . -type d -print0 \
    |   while read -r -d '' dir ; do
            matching_files=( "$dir"/$fileglob )
            (( ${#matching_files[*]} > 1 )) \
                && printf '%s\n' "${matching_files[@]}"
        done

Find folders containing multiple matches to a regex / grep
