Question

Suppose I want to count the lines of code in a project. If all of the files are in the same directory, I can run:

cat * | wc -l

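However, this doesn't work if there are sub-directories. For that, cat would need a recursive mode. I suspect this might be a job for xargs, but I wonder whether there is a more elegant solution?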

Answer 1

First, you do not need cat to count lines. This is an antipattern called Useless Use of Cat (UUoC). To count the lines of the files in the current directory, use wc:

wc -l *

Then the find command recurses into the sub-directories:

find . -name "*.c" -exec wc -l {} \;

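. is the directory the search starts from
-name "*.c" is the pattern of the files you are interested in
-exec gives the command to execute
{} is replaced by each result of find and passed to the command (here wc -l)
\; marks the end of the command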

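This command produces a list of all the files found together with their line counts. If you want the sum for all the files found, you can have find list the files (with the -print option) and pipe that list to xargs, which passes it as arguments to wc -l:

find . -name "*.c" -print | xargs wc -l

EDIT to address Robert Gamble's comment (thanks): if the file names contain spaces or newlines (!), you have to use the -print0 option instead of -print, together with xargs -0 (spelled --null in GNU xargs), so that the list of file names is exchanged as null-terminated strings.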

The Unix philosophy is to have tools that do one thing only, and do it well.
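As a minimal sketch of the null-terminated variant described in the edit above, assuming a find and xargs that support -print0 and -0 (GNU and BSD both do). The temporary directory and the file names are made up for illustration only.

```shell
# Hypothetical sandbox so the example does not touch real project files.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub dir"
printf 'a\nb\n' > "$demo/src/main.c"
printf 'c\nd\ne\n' > "$demo/src/sub dir/my util.c"

# Names are separated by NUL bytes, so spaces in paths are harmless.
# wc prints one line per file plus a final "total" line; keep only its number.
null_safe=$(find "$demo" -name '*.c' -print0 | xargs -0 wc -l | tail -n 1 | awk '{print $1}')
echo "$null_safe"

rm -rf "$demo"
```

Note that wc only prints the "total" line when it receives more than one file, and that a very long file list can still be split across several wc runs.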

Answer 2

If you want a code-golfing answer:

grep '' -R . | wc -l

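The problem with using wc -l on its own is that it cannot descend into directories, and the one-liner

find . -exec wc -l {} \;

won't give you a total line count, because it runs wc once for every file (LOL!), while

find . -exec wc -l {} +

gets confused as soon as find hits the ~200k-character argument limit (see the footnote on limits): it then calls wc multiple times, each time giving you only a partial total.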

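Additionally, the grep trick above adds no more than 1 line to the output when it encounters a binary file, which can be beneficial in some circumstances.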

For the cost of 1 extra command character, you can ignore binary files completely:

 grep '' -IR . | wc -l

If you want to run line counts on binary files too:

 grep '' -aR . | wc -l
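To make the effect of -I concrete, here is a small sketch, assuming a grep that supports the -R and -I options (GNU and BSD grep do; they are not part of POSIX). The sandbox directory and both file names are hypothetical.

```shell
# Hypothetical sandbox: one text file and one file containing a NUL byte.
demo=$(mktemp -d)
printf 'one\ntwo\n' > "$demo/notes.txt"
printf 'bin\000ary\n' > "$demo/blob.bin"

# -I treats binary files as non-matching, so only the text file contributes.
# tr strips the padding that some wc implementations add.
text_lines=$(grep -IR '' "$demo" | wc -l | tr -d '[:space:]')
echo "$text_lines"

rm -rf "$demo"
```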

Footnote on limits:

The documentation is a bit vague about whether this is a limit on string size or on the number of tokens.

cd /usr/include;
find -type f -exec perl -e 'printf qq[%s => %s\n], scalar @ARGV, length join q[ ], @ARGV' {} + 
# 4066 => 130974
# 3399 => 130955
# 3155 => 130978
# 2762 => 130991
# 3923 => 130959
# 3642 => 130989
# 4145 => 130993
# 4382 => 130989
# 4406 => 130973
# 4190 => 131000
# 4603 => 130988
# 3060 => 95435

This implies that the argument list gets split into chunks very easily.
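One way to stay correct when find splits the work into chunks is to ignore the partial "total" lines and add up the per-file counts instead. This is only a sketch under assumptions: the sandbox files are made up, and it relies on find printing paths with a directory prefix, so a per-file line never has the bare word "total" in its second field. File names containing newlines would still confuse it.

```shell
# Hypothetical sandbox with two small files.
demo=$(mktemp -d)
printf '1\n2\n3\n' > "$demo/a.txt"
printf '1\n2\n' > "$demo/b.txt"

# Skip wc's own "total" rows and sum the per-file counts,
# so the result does not depend on how many times wc was invoked.
grand_total=$(find "$demo" -type f -exec wc -l {} + |
  awk '$2 != "total" { sum += $1 } END { print sum + 0 }')
echo "$grand_total"

rm -rf "$demo"
```

On most systems, getconf ARG_MAX reports the kernel's limit on the combined size of arguments and environment, which is related to (but not the same as) the chunk size find chooses.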

Answer 3

I think you're probably stuck with xargs:

find -name '*php' | xargs cat | wc -l

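chromakode's method gives the same result but is much, much slower. If you use xargs, cat and wc can start running as soon as find starts finding files.

There is a good explanation at Linux: xargs vs. exec {}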

Answer 4

Try using the find command, which recurses into directories by default:

find . -type f -execdir cat {} \; | wc -l

Answer 5

The correct way is:

find . -name "*.c" -print0 | xargs -0 cat | wc -l

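You must use -print0 because only two characters are invalid in Unix file names: the null byte and "/" (slash). So, for example, "xxx\npasswd" (a name containing a newline) is valid. In practice, you are more likely to encounter names with spaces in them. Commands that pipe plain -print output to xargs would treat each word of such a name as a separate file.

You might also want to use "-type f" instead of -name to limit the search to files.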
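A small sketch of why -type f matters, combining it with -name rather than replacing it (that combination is my assumption, not something the answer spells out). The sandbox contains a hypothetical directory whose own name matches *.c, which cat would otherwise be asked to read.

```shell
# Hypothetical sandbox: a directory named like a C file, plus two real files.
demo=$(mktemp -d)
mkdir "$demo/legacy.c"
printf 'x\ny\n' > "$demo/legacy.c/old.c"
printf 'z\n' > "$demo/new file.c"

# -type f drops the directory "legacy.c" from the list; -print0 and -0
# keep "new file.c" as a single argument.
count=$(find "$demo" -type f -name '*.c' -print0 | xargs -0 cat | wc -l | tr -d '[:space:]')
echo "$count"

rm -rf "$demo"
```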

Answer 6

If you can use relatively recent GNU tools, including Bash, then using cat or grep as in the solutions above is wasteful:

wc -l --files0-from=<(find . -name \*.c -print0)

It handles file names with spaces, arbitrary recursion, and any number of matching files, even if they exceed the command-line length limit.
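If process substitution is not available (it is a Bash feature), GNU wc can also read the NUL-separated list from standard input by passing - as the file name. The following is a sketch assuming GNU coreutils; the sandbox and file names are hypothetical.

```shell
# Hypothetical sandbox with two files, one with a space in its name.
demo=$(mktemp -d)
printf 'a\nb\n' > "$demo/one.c"
printf 'c\n' > "$demo/two words.c"

# --files0-from=- makes wc read NUL-terminated names from stdin, so there is
# no command-line length limit and exactly one "total" line at the end.
files0_total=$(find "$demo" -name '*.c' -print0 |
  wc -l --files0-from=- | tail -n 1 | awk '{print $1}')
echo "$files0_total"

rm -rf "$demo"
```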

Answer 7

I like to use find and head together as a "recursive cat" over all the files in a project directory, for example:

find . -name "*rb" -print0 | xargs -0 head -10000

The advantage is that head adds the file name and path:

==> ./recipes/default.rb <==
DOWNLOAD_DIR = '/tmp/downloads'
MYSQL_DOWNLOAD_URL = 'http://cdn.mysql.com/Downloads/MySQL-5.6/mysql-5.6.10-debian6.0-x86_64.deb'
MYSQL_DOWNLOAD_FILE = "#{DOWNLOAD_DIR}/mysql-5.6.10-debian6.0-x86_64.deb"

package "mysql-server-5.5"
...

==> ./templates/default/my.cnf.erb <==
#
# The MySQL database server configuration file.
#
...

==> ./templates/default/mysql56.sh.erb <==
PATH=/opt/mysql/server-5.6/bin:$PATH

For a complete example, see my blog post:

http://haildata.net/2013/04/using-cat-recursively-with-nicely-formatted-output-including-headers/

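Note that I used 'head -10000', so the output is truncated for any file longer than 10,000 lines. I could use head -100000 instead, but for "informal project/directory browsing" this approach works pretty well for me.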
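A variation that avoids picking a line limit is tail -n +1, which prints each file from its first line and, like head, adds a header when given several files. This swaps in a different tool from the one the answer uses, and it assumes a tail that accepts multiple file operands (GNU and BSD tail do). The sandbox files are hypothetical.

```shell
# Hypothetical sandbox with two Ruby files.
demo=$(mktemp -d)
printf 'puts 1\nputs 2\n' > "$demo/a.rb"
printf 'puts 3\n' > "$demo/b.rb"

# tail -n +1 starts at line 1, so nothing is truncated; with more than one
# file it prints a "==> path <==" header before each file's contents.
listing=$(find "$demo" -name '*.rb' -print0 | xargs -0 tail -n +1)
printf '%s\n' "$listing"
header_count=$(printf '%s\n' "$listing" | grep -c '^==> ')

rm -rf "$demo"
```

If xargs happens to pass a single file to tail, no header is printed for it.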

Answer 8

If you only want a total line count rather than a line count for each file, something like:

find . -type f -exec wc -l {} \; | awk '{total += $1} END{print total}'

works well. This saves you from having to do further text filtering in a script.

Answer 9

wc -cl `find . -name "*.php" -type f`

Answer 10

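Here is a Bash script that counts the lines of code in a project. It traverses the source tree recursively, and it excludes blank lines and single-line comments that use "//".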

# $excluded is a regex for paths to exclude from line counting
excluded="spec\|node_modules\|README\|lib\|docs\|csv\|XLS\|json\|png"

countLines(){
  # $total is the total lines of code counted
  total=0
  # -mindepth excludes the current directory (".")
  for file in `find . -mindepth 1 -name "*.*" |grep -v "$excluded"`; do
    # First sed: only count lines of code that are not commented with //
    # Second sed: don't count blank lines
    # $numLines is the lines of code
    numLines=`cat $file | sed '/\/\//d' | sed '/^\s*$/d' | wc -l`
    total=$(($total + $numLines))
    echo "  " $numLines $file
  done
  echo "  " $total in total
}

echo Source code files:
countLines
echo Unit tests:
cd spec
countLines

Here is what the output looks like for my project:

Source code files:
   2 ./buildDocs.sh
   24 ./countLines.sh
   15 ./css/dashboard.css
   53 ./data/un_population/provenance/preprocess.js
   19 ./index.html
   5 ./server/server.js
   2 ./server/startServer.sh
   24 ./SpecRunner.html
   34 ./src/computeLayout.js
   60 ./src/configDiff.js
   18 ./src/dashboardMirror.js
   37 ./src/dashboardScaffold.js
   14 ./src/data.js
   68 ./src/dummyVis.js
   27 ./src/layout.js
   28 ./src/links.js
   5 ./src/main.js
   52 ./src/processActions.js
   86 ./src/timeline.js
   73 ./src/udc.js
   18 ./src/wire.js
   664 in total
Unit tests:
   230 ./ComputeLayoutSpec.js
   134 ./ConfigDiffSpec.js
   134 ./ProcessActionsSpec.js
   84 ./UDCSpec.js
   149 ./WireSpec.js
   731 in total

Enjoy! - Curran
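The per-file filtering in the script can also be sketched as a single awk call. This is an illustrative alternative, not the author's code: count_code_lines is a hypothetical helper name, and the sample file is made up. Like the two sed commands, it drops every line that contains // anywhere, so lines with trailing comments or URLs are not counted either.

```shell
# Hypothetical helper: count lines that contain no "//" and are not blank.
count_code_lines() {
  # !/\/\// skips any line containing "//"; NF is zero for whitespace-only lines.
  awk '!/\/\// && NF { n++ } END { print n + 0 }' "$1"
}

# Hypothetical sample file; the quoted path shows that spaces are handled.
demo=$(mktemp -d)
printf 'var a = 1;\n\n// comment\n   \nreturn a; // trailing\n' > "$demo/my file.js"

code_lines=$(count_code_lines "$demo/my file.js")
echo "$code_lines"

rm -rf "$demo"
```

Only the first line of the sample survives the filter, which shows both the intended behavior and the trailing-comment caveat.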

Answer 11

find . -name "*.h" -print | xargs wc -l

How to count lines of code, including sub-directories
