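Bash - Regex for determining whether a line of ls -al output is a file or a directory, and whether it is hidden

Time: 2017-01-08 16:18:22

Tags: regex linux bash shell

I am trying to work out, for each line of output from ls -al, whether it is a file or a directory and whether it is hidden, and to count how many of each type there are.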

Edit: I have to use find

#!/bin/bash
#declare four different regex statements that match files, hidden files, directories and hidden directories (excluding . and ..)
#based on the output of each line of running ls -al
re_file='^\-[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s[^\.](\w|\.)*$'
re_hidden_file='^\-[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s\.\w(\w|\.)*$'
re_directory='^d[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s[^\.](\w|\.)*$'
re_hidden_directory='^d[rwx\-]{9}\s[0-9]+\s([a-z_][a-z0-9_]{0,30})\s([a-z_][a-z0-9_]{0,30})\s[0-9]+\s\w{3}\s[0-9]+\s[0-9]{2}:[0-9]{2}\s\.\w(\w|\.)*$'
#declare four different counters for each type
file_count=0
hidden_file_count=0
directory_count=0
hidden_directory_count=0
#read through the output of ls -al line by line, assigning x the value of each line
ls -al $1 | while read x; do
  #test if each line matches each of the regex statements, if it does then increment the relevant counter
  if [[ $x =~ $re_file ]] ; then
    file_count+=1
  elif [[ $x =~ $re_hidden_file ]] ; then
    hidden_file_count+=1
  elif [[ $x =~ $re_directory ]] ; then
    directory_count+=1
  elif [[ $x =~ $re_hidden_directory ]] ; then
    hidden_directory_count+=1
  else
    echo "!!!"
  fi
done
total=$((file_count + hidden_file_count + directory_count + hidden_directory_count))
echo "Files found: $file_count (plus $hidden_file_count hidden)"
echo "Directories found: $directory_count (plus $hidden_directory_count hidden)"
echo "Total files and directories: $total"

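At the moment the script prints !!! for every line of ls -al, meaning none of the regexes match, and all the counter variables stay at 0. Here is a sample of the input (although Bash strips the extra spaces used for padding before the regex check is done).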

drwx--x--x  37 username groupname  4096 Jan  8 14:37 .
drwxr-xr-x 235 root     root       4096 Nov 15 12:16 ..
drwx------   3 username groupname  4096 Oct 27 14:35 .adobe
-rw-------   1 username groupname 14458 Dec  5 20:24 .bash_history
-rw-------   1 username groupname  2680 Sep 30 16:12 .bash_profile
-rw-------   1 username groupname  1210 Oct  7 09:40 .bashrc
drwx------  12 username groupname  4096 Dec  6 15:24 .cache
drwxr-xr-x  17 username groupname  4096 Jan  8 14:37 .config
drwx------   4 username groupname  4096 Dec  5 17:51 dir1
drwx------   2 username groupname  4096 Nov 23 12:26 dir2
...

I tested the regexes on an online regex checker and they evaluate the way I intend. I think this is a Bash-specific issue. Any help is appreciated.

2 answers:

Answer 0: (score: 2)

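You should not parse ls to get files. Use find with NUL termination, or globbing, instead.

The problem is that ls produces ambiguous output for file names that are otherwise perfectly legal. Consider:

$ touch a$'\t'b
$ touch a$'\n'b
$ ls -l a*
-rw-r--r--  1 andrew  wheel  0 Jan  8 08:25 a?b
-rw-r--r--  1 andrew  wheel  0 Jan  8 08:26 a?b

The non-printable characters \t and \n are replaced with ? by ls, which makes the two files ambiguous in the ls output.

The same happens with trailing spaces:

$ touch "a b c   "
$ touch "a b c       "
$ ls -al a\ b*
-rw-r--r--  1 andrew  wheel  0 Jan  8 08:44 a b c   
-rw-r--r--  1 andrew  wheel  0 Jan  8 08:44 a b c       

Now consider using find:

$ find . -name "a*" -maxdepth 1 -print0 | xargs -0 printf "'%s'\n"
'./a  b'
'./a
b'
'./a b c   '
'./a b c       '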
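The find approach can be extended to the counting task itself. The following is only a minimal sketch under some assumptions: the helper name count_entries is made up for illustration, anything that is not a directory is counted as a file (as in the glob-based scripts in these answers), and -mindepth, -maxdepth and -print0 are GNU/BSD find options rather than strict POSIX.

```shell
#!/bin/bash
# Sketch: count files and directories (regular and hidden) in one directory
# using find with NUL-terminated names. count_entries is a hypothetical helper.
count_entries() {
    local dir=${1:-.} path name
    file_count=0
    hidden_file_count=0
    directory_count=0
    hidden_directory_count=0

    # Process substitution keeps the loop in the current shell, so the
    # counters are still set when the loop ends.
    while IFS= read -r -d '' path; do
        name=${path##*/}              # strip the leading directory part
        if [[ -d $path ]]; then
            if [[ $name == .* ]]; then
                hidden_directory_count=$((hidden_directory_count + 1))
            else
                directory_count=$((directory_count + 1))
            fi
        else
            if [[ $name == .* ]]; then
                hidden_file_count=$((hidden_file_count + 1))
            else
                file_count=$((file_count + 1))
            fi
        fi
    done < <(find "$dir" -mindepth 1 -maxdepth 1 -print0)
}

count_entries "${1:-.}"
echo "Files found: $file_count (plus $hidden_file_count hidden)"
echo "Directories found: $directory_count (plus $hidden_directory_count hidden)"
```

Because -mindepth 1 skips the starting directory itself, . and .. never show up in the counts, and names containing tabs, newlines or trailing spaces are read intact.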

Or just use globbing:

$ for fn in a*; do printf "'%s'\n" "$fn"; done
'a  b'
'a
b'
'a b c   '
'a b c      '

If you want the total number of directories and the total number of files, including hidden files and directories, just add that to your glob pattern:

file_count=0
hidden_file_count=0
regular_directory_count=0
hidden_directory_count=0

echo "=====regular files and directories:"
for fn in *; do 
    printf "'%s'\n" "$fn" 
    if [ -d "$fn" ]; then
        regular_directory_count=$((regular_directory_count+1))
    else
        file_count=$((file_count+1))
    fi      
done
echo "====hidden files and directories:"
for fn in .*; do 
    printf "'%s'\n" "$fn"; 
    if [ -d "$fn" ]; then
        hidden_directory_count=$((hidden_directory_count+1))
    else
        hidden_file_count=$((hidden_file_count+1))
    fi          
done

printf "Regular files: %s regular directories: %s\n" $file_count $regular_directory_count
printf "Hidden files:  %s hidden directories:  %s\n" $hidden_file_count $hidden_directory_count
tf=$((hidden_file_count+file_count))
td=$((hidden_directory_count+regular_directory_count))
printf "Total files:   %s total directories:   %s\n"  $tf $td

Given:

$ ls -la
total 0
drwxr-xr-x   9 andrew  wheel   306 Jan  8 11:07 .
drwxrwxrwt  92 root    wheel  3128 Jan  8 10:58 ..
drwxr-xr-x   2 andrew  wheel    68 Jan  8 11:07 .hidden dir
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 .hidden file
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 a?b
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 a?b
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 a b c   
-rw-r--r--   1 andrew  wheel     0 Jan  8 11:26 a b c       
drwxr-xr-x   2 andrew  wheel    68 Jan  8 11:07 regular dir

Run that and you get:

=====regular files and directories:
'a  b'
'a
b'
'a b c   '
'a b c       '
'regular dir'
====hidden files and directories:
'.'
'..'
'.hidden dir'
'.hidden file'
Regular files: 4 regular directories: 1
Hidden files:  1 hidden directories:  3
Total files:   5 total directories:   4

If you want to exclude the . and .. hidden directories, you can set GLOBIGNORE=".:.." before using the .* glob pattern.
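As a small illustration of the GLOBIGNORE suggestion, here is a sketch that lists hidden entries without . and .. appearing in the result. The helper name list_hidden is an assumption made for this example, and the function body runs in a subshell so the setting does not affect the rest of the script.

```shell
#!/bin/bash
# Sketch: list hidden entries of a directory, leaving out . and ..
# list_hidden is a hypothetical helper name used only for illustration.
list_hidden() (
    # The parentheses run the body in a subshell, so neither the cd nor
    # the GLOBIGNORE setting leaks into the calling shell.
    cd -- "$1" || exit 1
    GLOBIGNORE=".:.."
    for fn in .*; do
        # If nothing matched, the loop sees the literal pattern; skip it.
        [ -e "$fn" ] || [ -L "$fn" ] || continue
        printf '%s\n' "$fn"
    done
)

list_hidden "${1:-.}"
```

One thing to keep in mind: according to the Bash manual, setting GLOBIGNORE to a non-null value also enables dotglob, so a plain * would start matching hidden names too. Confining the setting to a subshell, or unsetting GLOBIGNORE afterwards, avoids double counting in a script that loops over both * and .*.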

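Answer 1: (score: 2)

It took me a while to get this working.

My approach: avoid parsing the output of ls -l, especially since you do not need it here. Enable the option (using shopt) that lets the * in the for loop see hidden objects, and test each object according to its type.

Also: a+=1 does not do what you think it does. It just appends 1 to the end of the string.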
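To make the += point concrete, here is a short sketch. The variable names count, n and m are chosen only for this example.

```shell
#!/bin/bash
# Sketch: += on an ordinary shell variable is string concatenation.
count=0
count+=1          # appends the character "1", giving "01"
count+=1          # appends again, giving "011"
echo "$count"     # prints 011

# Alternatives that perform real addition:
n=0
n=$((n + 1))      # arithmetic expansion
((n += 1))        # arithmetic command
declare -i m=0    # the integer attribute makes += arithmetic
m+=1
echo "$n $m"      # prints 2 1
```

The scripts in these answers use the $((var+1)) form, which works without declaring the variable as an integer.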

#!/bin/bash
# regex that matches names starting with a dot (hidden files and directories)
re_hidden_file='^\..*'
# declare four different counters, one for each type
file_count=0
hidden_file_count=0
directory_count=0
hidden_directory_count=0

# make * match hidden files/directories as well
shopt -s dotglob
# loop over every entry in the current directory, assigning x each name
for x in * ; do
  # test whether the entry is a directory, then whether its name is hidden,
  # and increment the relevant counter
  if [ -d "$x" ] ; then
    if [[ "$x" =~ $re_hidden_file ]] ; then
      hidden_directory_count=$((hidden_directory_count+1))
    else
      directory_count=$((directory_count+1))
    fi
  else
    if [[ "$x" =~ $re_hidden_file ]] ; then
      hidden_file_count=$((hidden_file_count+1))
    else
      file_count=$((file_count+1))
    fi
  fi
done


total=$((file_count + hidden_file_count + directory_count + hidden_directory_count))
echo "Files found: $file_count (plus $hidden_file_count hidden)"
echo "Directories found: $directory_count (plus $hidden_directory_count hidden)"
echo "Total files and directories: $total"