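Sort filenames by the length of the filename

Time: 2012-03-26 07:50:17

Tags: unix sorting

ls displays the files available in a directory. I want to display the filenames sorted by the length of the filename.

Any help would be highly appreciated. Thanks in advance.

5 answers:

Answer 0 (score: 9)

You can do it like this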

for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n

Answer 1 (score: 5)

The simplest way is just:

$ ls | perl -e 'print sort { length($b) <=> length($a) } <>'

Answer 2 (score: 3)

Make the test files:

mkdir -p test; cd test 
touch short-file-name  medium-file-name  loooong-file-name

Script:

ls |awk '{print length($0)"\t"$0}' |sort -n |cut --complement -f1

Output:

short-file-name
medium-file-name
loooong-file-name
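The pipeline above follows a decorate-sort-undecorate pattern: awk prefixes each name with its length, sort -n orders by that number, and cut removes the prefix again. cut --complement is a GNU extension, so a variant using cut -f2- may be more portable. A sketch under that assumption (the function name sort_names_by_length is made up for illustration, and names containing newlines are still not handled):

```shell
#!/usr/bin/env bash
# Hypothetical helper: read names on stdin (one per line), print shortest first.
sort_names_by_length() {
  awk '{ print length($0) "\t" $0 }' |  # decorate: "<length><TAB><name>"
    sort -n |                            # sort numerically on the leading length
    cut -f2-                             # undecorate: drop the length column
}

# Usage with the three test names from this answer:
printf '%s\n' loooong-file-name short-file-name medium-file-name |
  sort_names_by_length
```

Using -f2- rather than -f2 keeps names that themselves contain a tab in one piece.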

Answer 3 (score: 1)

TL;DR

Command:

find . -maxdepth 1 -type f -print0 | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | perl -F'/\0/' -ape '$_=join("\n", sort { length($b) <=> length($a) } @F)' | sed 's#/#/\\n/#g'

An easier-to-read version of the same command:

find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g'

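Not Parsing ls Output AND Benchmarking

There are good answers here. However, if you want to follow the advice not to parse the output of ls, here are a few ways to get the job done. They take particular care of the case where filenames contain spaces. I am going to benchmark everything here, including the parsing-ls examples. (Hopefully I will get to that soon.) I put in some random filenames that I have downloaded from various places over the last 25 years or so, 73 of them to begin with. All 73 are "normal" filenames, with only alphanumeric characters, underscores, dots, and hyphens. I will add 2 more (to show some of the problems).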

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ mkdir ../dir_w_fnames__spaces

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cp ./* ../dir_w_fnames__spaces/

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cd ../dir_w_fnames__spaces/

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ touch "just one file with a really long filename that can throw off some counts bla so there"

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ mkdir ../dir_w_fnames__spaces_and_newlines

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cp ./* ../dir_w_fnames__spaces_and_newlines/

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cd ../dir_w_fnames__spaces_and_newlines/

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ touch $'w\nlf.aa'

This one, i.e. the filename

w
lf.aa

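stands for "w(ith) l(ine)f(eed)"; I did it this way to make problems easier to spot. I don't know why I chose .aa as the file extension, except that it makes this filename's length easy to see in the various forms.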
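The $'...' form used in the touch command is bash's ANSI-C quoting, which turns \n into a real newline character, so the name is a single 7-character string. A small sketch that checks this without creating any file (the variable name is illustrative only):

```shell
#!/usr/bin/env bash
# $'...' (ANSI-C quoting) expands \n to a literal newline inside the string.
name=$'w\nlf.aa'

# ${#name} is the length in characters: w, newline, l, f, ., a, a.
echo "length: ${#name}"

# printf %q shows the string in a re-usable, shell-quoted form.
printf '%q\n' "$name"
```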

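Now I am going back to the orig_dir_73 directory; please trust me that this directory contains only files. We will use a surefire method to get the number of files.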

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ du --inodes
74      .

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ # The 74th inode is for the current directory, '.'; we have 73 files

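There is an even more certain way, one that does not rely on the directory containing only files and does not require you to remember the extra inode for '.'. I just looked through the man pages, did some research, and did some experimenting. The command is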

awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'

or, more readably,

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'

Let's find out how many files we have

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'
73

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ cd ../dir_w_fnames__spaces

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'
74

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ cd ../dir_w_fnames__spaces_and_newlines/

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'
75

(See [1] for details, and for the edge case in an earlier solution that led to this command.)
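Another way to count null-terminated names, shown here only as a sketch and not as the command this answer used, is to delete everything except the NUL bytes and count what is left. It assumes a find that supports -maxdepth and -print0 (GNU and BSD find both do); count_regular_files is a made-up name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count regular files directly inside a directory.
count_regular_files() {
  # One NUL is printed per file, whatever characters the name contains,
  # so counting NUL bytes counts files.
  find "${1:-.}" -maxdepth 1 -type f -print0 | tr -cd '\0' | wc -c
}

# Usage: a scratch directory with a plain name, a spaced name and a newline name.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.dat" "$demo_dir/two words.txt" "$demo_dir/"$'w\nlf.aa'
count_regular_files "$demo_dir"
rm -r "$demo_dir"
```

Some wc implementations pad the number with leading spaces, so compare it numerically rather than as a string.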

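I will be switching back and forth between these directories; just make sure you pay attention to the paths, because I will not point out every switch.


* Works even with strange filenames (containing spaces, newlines, etc.)

1a. Perl à la @tchrist, plus other stuff

Uses find with a null separator. Hacks around newlines in filenames.

Command: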

find . -maxdepth 1 -type f -print0 | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | perl -F'/\0/' -ape '$_=join("\n", sort { length($b) <=> length($a) } @F)' | sed 's#/#/\\n/#g'

An easier-to-read version of the same command:

find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g'

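I will actually show part of the sorted results, to show that the following commands work. I will also show how to check that the strange filenames have not broken anything.

Note that one would not normally use head or tail if one wanted the whole sorted list (hopefully not a sordid list). I am only using those commands for demonstration.

First, the "normal" filenames.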

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | tail -n 5
137.csv
13.csv
o6.dat
3.csv
a.dat

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ # No spaces in fnames, so...

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f | wc -l
73
  • Works for normal filenames

Next: spaces

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
  • Works for filenames containing spaces

Next: newlines

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#/\\n/#g' | tail -8
Lk3f.png
LOqU.txt
137.csv
w/\n/lf.aa
13.csv
o6.dat
3.csv
a.dat

If you like, you can also change this command slightly so that the filename appears with the newline "evaluated".

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
sed 's#\./.*/\([^/]\+\)\./$#\1#g' | tr '\n' '/' | \
perl -F'/\0/' -ape \
  '$_=join("\n", sort { length($b) <=> length($a) } @F)' | \
sed 's#/#\n#g' | tail -8
LOqU.txt
137.csv
w
lf.aa
13.csv
o6.dat
3.csv
a.dat

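Either way, because of what we have been doing, you will know the list is sorted even if it does not look that way.

(Illustration of a list that does not look sorted by filename length)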

********
********
*******
**********       <-- Visual Problem
*****
*****
****
****

OR

********
*******
*                <-- Visual
****             <-- Problems
*****
*****
****
****
  • Works for filenames containing newlines
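A different way to keep newline-containing names intact, offered as a sketch rather than as what this answer does, is to keep every record NUL-terminated all the way through the pipeline. This assumes bash and GNU coreutils, where sort and cut accept -z for NUL-terminated records (cut -z needs a reasonably recent coreutils); names_by_length_z is a made-up name. Unlike find, the * glob does not match hidden files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the regular files in a directory, longest name
# first, as NUL-terminated records. Runs in a subshell so cd stays local.
names_by_length_z() (
  cd -- "${1:-.}" || exit 1
  for f in *; do
    [ -f "$f" ] || continue              # regular files only (also skips an unmatched glob)
    printf '%d\t%s\0' "${#f}" "$f"       # decorate: "<length><TAB><name><NUL>"
  done |
    sort -z -rn |                        # numeric, descending, NUL-terminated records
    cut -z -f2-                          # drop the length column, keep the NULs
)

# Usage: turn NULs into visible separators for display only.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.dat" "$demo_dir/hello world.txt" "$demo_dir/"$'w\nlf.aa'
names_by_length_z "$demo_dir" | tr '\0' '|'
echo
rm -r "$demo_dir"
```

Because the newline stays inside its record, there is no need to substitute another character for it before sorting.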

* 2a. Very close, but does not keep the newline-containing filename together - à la @cpasm

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2- | head
lf.aa
3.csv
a.dat
13.csv
o6.dat
137.csv
w
1UG5.txt
1uWj.txt
2Ese.txt

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2- | tail -5
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
just one file with a really long filename that can throw off some counts bla so there
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt

Note that, in the head output, the w of

w(\n)
lf.aa

is in the correct position for a filename 7 characters long. However, lf.aa is not in a logical position.
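The reason is that sort works line by line, and the printf in the loop emits this one name as two lines: the first carries the length prefix, the second has none and so sorts numerically as 0. A short sketch that makes the two lines visible (nothing here touches the filesystem):

```shell
#!/usr/bin/env bash
name=$'w\nlf.aa'

# The same printf as in the loop: one name, but two output lines.
printf '%d\t%s\n' "${#name}" "$name" |
  awk '{ printf "line %d: [%s]\n", NR, $0 }'

# Count the lines that sort would see for this single name.
line_count=$(printf '%d\t%s\n' "${#name}" "$name" | awk 'END { print NR }')
echo "lines seen by sort: $line_count"
```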


* Hard to break (only '\n' and possibly command characters could be a problem)

1b. Perl à la @tchrist, using find rather than ls

Uses find with a null separator, and xargs

Command:

find . -maxdepth 1 -type f -print0 | xargs -I'{}' -0 echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | perl -e 'print sort { length($b) <=> length($a) } <>'

An easier-to-read version of the same command:

find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
        perl -e 'print sort { length($b) <=> length($a) } <>'

Here we go.

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
      perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt

bballdave025@MY-MACHINE /home/bballdave025/orig_dir_73
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
      perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
IKlT.txt
Lk3f.png
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
  • Works for normal filenames
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
      perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
  • Works for filenames containing spaces
bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | \
      perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ find . -maxdepth 1 -type f -print0 | \
  xargs -I'{}' -0 \
    echo "{}" | sed 's#\./.*/\([^/]\+\)\./$#\1#g' | 
      perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
lf.aa
w

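WARNING

  • BREAKS for filenames containing newlines

1c. Works for normal filenames and for filenames with spaces, but breaks for filenames containing newlines - à la @tchrist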

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>' | head -n 5
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
just one file with a really long filename that can throw off some counts bla so there
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | perl -e 'print sort { length($b) <=> length($a) } <>' | tail -8
LOqU.txt
137.csv
13.csv
o6.dat
3.csv
a.dat
lf.aa
w

3a. Works for normal filenames and for filenames with spaces, but breaks for filenames containing newlines - à la @Peter_O

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | awk '{print length($0)"\t"$0}' | sort -n | cut --complement -f1 | head -n 8
w
3.csv
a.dat
lf.aa
13.csv
o6.dat
137.csv
1UG5.txt

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ ls | awk '{print length($0)"\t"$0}' | sort -n | cut --complement -f1 | tail -5
83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
just one file with a really long filename that can throw off some counts bla so there
oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt

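* Easier to break

4a. Works for normal filenames - à la @Raghuram

This version breaks with filenames containing spaces or newlines (or both).

I do want to add that I like seeing the actual string length displayed, even if only for analysis purposes.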

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n | head -n 20
1 a
1 w
2 so
3 bla
3 can
3 off
3 one
4 file
4 just
4 long
4 some
4 that
4 with
5 3.csv
5 a.dat
5 lf.aa
5 there
5 throw
6 13.csv
6 counts

bballdave025@MY-MACHINE /home/bballdave025/dir_w_fnames__spaces_and_newlines
$ for i in `ls`; do LEN=`expr length $i`; echo $LEN $i; done | sort -n | tail -5
69 17f09d51d6280fb8393d5f321f344f616c461a57a8b9cf9cc3099f906b567c992.txt
70 83dfee2e0f8560dbd2a681a5a40225fd260d3b428b962dcfb75d17e43a5fdec9_1.txt
76 79496d6167652f71526c586e654345744a365939773d3d2d343538sfp6m8o1m53hlwlfja.dat
87 oinwrxK2ea1sfp6m8o49255f679496d6167652f71526c586e654345744a365939773d3d2d343538b3e0.csv
238 68747470733a2f2f73332e616d617a6f6e6177732e636f6d2f776174747061642d6d656469612d736572766963652f53746f7279496d6167652f71526c586e654345744a365939773d3d2d3435383139353437362e313464633462356336326266656365303439363432373931333139382e676966.txt
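The short entries at the top of the head output (1 a, 2 so, 3 bla, and so on) come from word splitting: the unquoted command substitution of ls is split on spaces, tabs and newlines, so one long name becomes many loop items. A sketch contrasting that with a glob, which yields each name as one item (the function names are made up for illustration):

```shell
#!/usr/bin/env bash
# Count loop iterations when iterating over unquoted $(ls) output.
count_items_ls() (
  cd -- "$1" || exit 1
  n=0
  for i in $(ls); do n=$((n + 1)); done   # split on whitespace: one item per word
  echo "$n"
)

# Count loop iterations when iterating over a glob.
count_items_glob() (
  cd -- "$1" || exit 1
  n=0
  for i in *; do n=$((n + 1)); done       # one item per directory entry
  echo "$n"
)

demo_dir=$(mktemp -d)
touch "$demo_dir/just one file"           # a single file whose name has two spaces
echo "via ls:   $(count_items_ls "$demo_dir")"
echo "via glob: $(count_items_glob "$demo_dir")"
rm -r "$demo_dir"
```

With the single three-word name, the ls loop runs three times and the glob loop once.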

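Explanation of some commands

For now, I will just note that, in the find command that works in general, I used '/' to replace the newline because, apart from the null byte, it is the only character that is illegal in a filename on both *NIX and Windows.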
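Because '/' can never appear inside a single filename component, swapping newlines for it loses no information as long as the names carry no directory prefix. A stripped-down sketch of just that substitution step, using made-up input rather than find output:

```shell
#!/usr/bin/env bash
# Two NUL-terminated names; the first contains a newline.
# Step 1 (tr '\n' '/'): hide the newline inside the name as '/'.
# Step 2 (tr '\0' '\n'): turn the NUL terminators into line ends.
printf '%s\0' $'w\nlf.aa' 'a.dat' | tr '\n' '/' | tr '\0' '\n'

# The result has exactly one name per line, so line-based tools such as
# sort can treat each name as a single record.
flattened=$(printf '%s\0' $'w\nlf.aa' 'a.dat' | tr '\n' '/' | tr '\0' '\n')
```

Once the line-based step is finished, the '/' can be shown as it is or mapped back to a newline for display.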


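Notes

[1] The command used previously,

du --inodes --files0-from=<(find . -maxdepth 1 -type f -print0) | \
awk '{sum+=int($1)}END{print sum}'

will work in this case because, when a filename contains a newline and there is therefore an "extra" line in the output piped to awk, awk's int function evaluates the text of that line to 0. Specifically, for our newline-containing filename, w\nlf.aa, i.e.

w
lf.aa

we would get

$ awk '{print int($1)}' < <(echo "lf.aa")
0

If the filename is something like

firstline\n3 and some other\n1\n2\texciting\n86stuff.jpg

or, more readably,

firstline
3 and some other
1
2     exciting
86stuff.jpg

well, I guess the computer has beaten me. If anyone has a solution, I would be glad to hear it.

Edit: I think I am in too deep on this question. From this SO answer and some experimentation, I got this command (I do not understand all the details, but I have tested it pretty well):

awk -F"\0" '{print NF-1}' < <(find . -maxdepth 1 -type f -print0) | awk '{sum+=$1}END{print sum}'

More readably:

awk -F"\0" '{print NF-1}' < \
  <(find . -maxdepth 1 -type f -print0) | \
    awk '{sum+=$1}END{print sum}'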

Answer 4 (score: 0)

for i in *; do printf "%d\t%s\n" "${#i}" "$i"; done | sort -n | cut -f2-
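Since the loop uses a glob and quotes "$i", names with spaces stay in one piece; only newlines inside names can still confuse the line-based sort. If it helps to see the lengths, a variant that keeps the count column might look like this (list_with_lengths is a made-up name, and the subshell keeps the cd local to the function):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "<length><TAB><name>" for every entry in a
# directory, shortest name first.
list_with_lengths() (
  cd -- "${1:-.}" || exit 1
  for i in *; do
    [ -e "$i" ] || continue                # skip the literal '*' in an empty directory
    printf '%d\t%s\n' "${#i}" "$i"
  done | sort -n
)

demo_dir=$(mktemp -d)
touch "$demo_dir/a.dat" "$demo_dir/13.csv" "$demo_dir/two words.txt"
list_with_lengths "$demo_dir"
rm -r "$demo_dir"
```

Piping the result through cut -f2- gives the same bare list of names as the one-liner above.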