Question

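I'm trying to run wc -l over an entire directory and then display each file name in an echo together with its line count.

To add to my frustration, the directory has to come from a passed argument. So, without looking stupid, can someone first tell me why a simple wc -l $1 doesn't give me the line count for the directory I type in as the argument? I know I'm not understanding it completely.

On top of that, I also need validation in case the argument given is not a directory or there is more than one argument.

As always, you guys are great.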

Answer 1

wc works on files rather than directories, so if you want to run wc on all files in the directory, you would start with:

wc -l "$1"/*

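With various gyrations to get rid of the total, sort the result and extract only the largest, you could end up with something like this (split across multiple lines for readability, but it should be entered on a single line):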

pax> wc -l "$1"/* 2>/dev/null
       | grep -v ' total$'
       | sort -n -k1
       | tail -n 1

2892 target_dir/big_honkin_file.txt
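
As a variation on the pipeline above, here is a minimal sketch that uses a loop with input redirection instead of the grep filter, so wc never prints a "total" line and subdirectories are skipped. The function name largest_by_lines is made up for illustration, and the sketch assumes file names do not contain newlines.

```shell
# Hypothetical helper: print "<lines> <path>" for the regular file with the
# most lines directly inside directory $1 (no recursion).
largest_by_lines() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue      # skip subdirectories and an unmatched glob
        n=$(wc -l < "$f")            # reading from stdin: wc prints only the count
        echo "$((n)) $f"             # $((n)) drops any padding some wc versions add
    done | sort -n -k1,1 | tail -n 1
}
```

Calling largest_by_lines "$1" from a script prints a single line for the biggest file, or nothing if the directory holds no regular files.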

As for validation, you can check the number of parameters passed to your script with something like:

if [[ $# -ne 1 ]] ; then
    echo 'Whoa! Wrong parameter count'
    exit 1
fi

And you can check that it is a directory with:

if [[ ! -d $1 ]] ; then
    echo 'Whoa!' "[$1]" 'is not a directory'
    exit 1
fi
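
The two checks can also be folded into one reusable function. This is only a sketch: the name check_dir_arg and the distinct return codes are assumptions for illustration, and it uses the portable [ ... ] test rather than [[ ... ]] so it also runs under plain sh.

```shell
# Hypothetical helper: succeed only when called with exactly one argument
# that names an existing directory; otherwise explain why on stderr.
check_dir_arg() {
    if [ "$#" -ne 1 ]; then
        echo "expected exactly one directory argument, got $#" >&2
        return 1
    fi
    if [ ! -d "$1" ]; then
        echo "[$1] is not a directory" >&2
        return 2
    fi
}

# In a script, pass the script's own arguments through:
#   check_dir_arg "$@" || exit 1
```

Passing "$@" with quotes matters here: it keeps an argument containing spaces as a single argument, so the count check stays accurate.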

Answer 2

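"I'm trying to run wc -l over an entire directory and then display each file name in an echo together with its line count."

You can run find on the directory and use the -exec option to trigger wc -l. Something like this: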

$ find ~/Temp/perl/temp/ -exec wc -l '{}' \;
wc: /Volumes/Data/jaypalsingh/Temp/perl/temp/: read: Is a directory
      11 /Volumes/Data/jaypalsingh/Temp/perl/temp//accessor1.plx
      25 /Volumes/Data/jaypalsingh/Temp/perl/temp//autoincrement.pm
      12 /Volumes/Data/jaypalsingh/Temp/perl/temp//bless1.plx
      14 /Volumes/Data/jaypalsingh/Temp/perl/temp//bless2.plx
      22 /Volumes/Data/jaypalsingh/Temp/perl/temp//classatr1.plx
      27 /Volumes/Data/jaypalsingh/Temp/perl/temp//classatr2.plx
       7 /Volumes/Data/jaypalsingh/Temp/perl/temp//employee1.pm
      18 /Volumes/Data/jaypalsingh/Temp/perl/temp//employee2.pm
      26 /Volumes/Data/jaypalsingh/Temp/perl/temp//employee3.pm
      12 /Volumes/Data/jaypalsingh/Temp/perl/temp//ftp.plx
      14 /Volumes/Data/jaypalsingh/Temp/perl/temp//inherit1.plx
      16 /Volumes/Data/jaypalsingh/Temp/perl/temp//inherit2.plx
      24 /Volumes/Data/jaypalsingh/Temp/perl/temp//inherit3.plx
      33 /Volumes/Data/jaypalsingh/Temp/perl/temp//persisthash.pm
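
The "Is a directory" message at the top of that output appears because find also hands the starting directory itself to wc. A minimal sketch that restricts the search to regular files with -type f; the function name count_lines_per_file is only an illustrative choice.

```shell
# Hypothetical helper: print "<lines> <path>" for every regular file under $1.
count_lines_per_file() {
    # -type f keeps directories away from wc; '{}' \; runs wc once per file,
    # so no "total" summary line is printed.
    find "$1" -type f -exec wc -l '{}' \;
}
```

Running wc once per file is slower on large trees than batching with -exec ... {} +, but it keeps the output free of summary lines.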

Answer 3

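Nice question!

I've seen the answers. Some are pretty good. The find ... | xargs one is my favourite. It can be simplified anyway by using the find ... -exec wc -l {} + syntax. But there is a problem: when the command-line buffer fills up, wc -l ... is invoked more than once, and each invocation prints its own <number> total line. As wc has no option to disable this, wc has to be reimplemented. Filtering these lines out with grep is not nice.

So my complete answer is: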

#!/usr/bin/bash

[ $# -ne 1 ] && echo "Bad number of args">&2 && exit 1
[ ! -d "$1" ] && echo "Not dir">&2 && exit 1
find "$1" -type f -exec awk '{++n[FILENAME]}END{for(i in n) printf "%8d %s\n",n[i],i}' {} +

Or, using less temporary space but with slightly larger awk code:

find "$1" -type f -exec awk 'function pr(){printf "%8d %s\n",n,f}FNR==1{f&&pr();n=0;f=FILENAME}{++n}END{pr()}' {} +

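Notes

If it should not descend into subdirectories, add -maxdepth 1 before -type in the find command.
It is quite fast. I was afraid it would be much slower than the find ... wc + version, but for a directory containing 14770 files (in several subdirectories) the wc version ran in 3.8 seconds and the awk version in 5.2 seconds.
awk and wc treat a line that does not end in \n differently. A last line without a trailing \n is not counted by wc. I prefer to count it, as awk does.
It does not print empty files.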
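
The difference in how wc and awk treat an unterminated last line is easy to reproduce. A small sketch using a throwaway temporary file; the variable names are arbitrary.

```shell
# Build a two-line file whose last line lacks a trailing newline.
tmp=$(mktemp)
printf 'first\nsecond' > "$tmp"

wc_count=$(( $(wc -l < "$tmp") ))            # wc -l counts newline characters
awk_count=$(awk 'END { print NR }' "$tmp")   # awk counts records, terminated or not
echo "wc: $wc_count  awk: $awk_count"
rm -f "$tmp"
```

This should print "wc: 1  awk: 2", since the file contains only one newline character but two records.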

Answer 4

Is this what you want?

> find ./test1/ -type f|xargs wc -l
       1 ./test1/firstSession_cnaiErrorFile.txt
      77 ./test1/firstSession_cnaiReportFile.txt
   14950 ./test1/exp.txt
       1 ./test1/test1_cnaExitValue.txt
   15029 total

So the directory from your argument would go here:

find $your_complete_directory_path/ -type f|xargs wc -l
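
If the goal is one echo per file, as in the question, the output of wc can be read back in a loop. A rough sketch, assuming file names contain no newlines; report_line_counts is a made-up name, and it uses find -exec ... {} + in place of xargs so that names with spaces survive.

```shell
# Hypothetical helper: echo one sentence per regular file under directory $1.
report_line_counts() {
    find "$1" -type f -exec wc -l {} + |
    while read -r count path; do
        # With several files wc appends a "<n> total" summary line; skip it.
        [ "$path" = "total" ] && continue
        echo "$path has $count lines"
    done
}
```

The read splits each line at the first run of whitespace, so the count lands in one variable and the rest of the line, spaces included, in the other.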

Answer 5

To find the file with the most lines in the current directory and its subdirectories, with zsh:

lines() REPLY=$(wc -l < "$REPLY")
wc -l -- **/*(D.nO+lines[1])

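This defines a lines function to be used as a glob sorting function: it returns in $REPLY the number of lines of the file whose path is given in $REPLY.

Then we use zsh's recursive globbing **/* to find regular files (.), sorted numerically (n) in reverse order (O) by the lines function (+lines), and select the first one ([1]). (D includes dotfiles and traverses dot directories.)

Doing it with standard utilities is a bit tricky if you don't want to make assumptions about which characters file names may contain (newline, space...). With the GNU tools found on most Linux distributions it is a bit easier, as they can deal with NUL-terminated lines: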

find . -type f -exec sh -c '
  for file do
    size=$(wc -l < "$file") &&
      printf "%s\0" "$size:$file"
  done' sh {} + |
  tr '\n\0' '\0\n' |
  sort -rn |
  head -n1 |
  tr '\0' '\n'

Or with zsh or GNU bash syntax:

biggest= max=-1
find . -type f -print0 |
  {
    while IFS= read -rd '' file; do
      size=$(wc -l < "$file") &&
        ((size > max)) &&
        max=$size biggest=$file
    done
    [[ -n $biggest ]] && printf '%s\n' "$max: $biggest"
  }

Answer 6

Here is one that works for me under Git Bash (mingw32) on Windows:

find . -type f -print0| xargs -0 wc -l

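This lists the files and their line counts in the current directory and its subdirectories. You can also redirect the output to a text file and import it into Excel if needed: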

find . -type f -print0| xargs -0 wc -l > fileListingWithLineCount.txt

Find the file in a directory with the most lines, not bytes