Question

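I have a directory containing more than 60000 files. How can I get only N of them without resorting to find | head -n or ls | head -n, since find and ls take too long to read the whole file list? Is there an option for ls or find, or some other program, that would help save time?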

Answer 1

You can write your own simple utility in C.

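#include <stdio.h>
#include <stdlib.h>     /* atoi */
#include <string.h>     /* strcmp */
#include <sys/types.h>
#include <dirent.h>

/* Usage: prog DIRECTORY N -- print the first N entries of DIRECTORY. */
int main(int argc, char **argv) {
  DIR *dir;
  struct dirent *ent;
  int i = 0, n;

  if (argc != 3) {
    fprintf(stderr, "usage: %s DIRECTORY N\n", argv[0]);
    return 1;
  }
  n = atoi(argv[2]);
  dir = opendir(argv[1]);
  if (dir == NULL) {
    perror(argv[1]);
    return 1;
  }
  /* readdir() returns entries in directory order, unsorted. */
  while (i < n && (ent = readdir(dir)) != NULL) {
    /* Skip the "." and ".." entries. */
    if (strcmp(ent->d_name, ".") == 0 ||
        strcmp(ent->d_name, "..") == 0)
      continue;
    printf("%s\n", ent->d_name);
    i++;
  }
  closedir(dir);
  return 0;
}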

This is just a quick and dirty first draft, but you get the idea.

Answer 2

For what it's worth:

# Create 60000 files
sh$ for i in {0..100}; do
    for j in {0..600}; do
        touch $(printf "%05d" $(($i+$j*100)));
    done;
done

On Linux Debian Wheezy x86_64 with an ext4 filesystem:

sh$ time bash -c 'ls | head -n 50000 | tail -10'
49990
49991
49992
49993
49994
49995
49996
49997
49998
49999

real    0m0.248s
user    0m0.212s
sys     0m0.024s

sh$ time bash -c 'ls -f | head -n 50000 | tail -10'
27235
02491
55530
44435
24255
47247
16033
45447
18434
35303

real    0m0.051s
user    0m0.016s
sys     0m0.028s

sh$ time bash -c 'find | head -n 50000 | tail -10'
./02491
./55530
./44435
./24255
./47247
./16033
./45447
./18434
./35303
./07658

real    0m0.051s
user    0m0.024s
sys     0m0.024s

sh$ time bash -c 'ls -f | sed -n 49990,50000p'
30950
27235
02491
55530
44435
24255
47247
16033
45447
18434
35303

real    0m0.046s
user    0m0.032s
sys     0m0.016s

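Of course, the following two are faster, since they only take the first entries (and they interrupt the peer process with a broken pipe once the needed "lines" have been read):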

sh$ time bash -c 'ls -f | sed 1000q >/dev/null'

real    0m0.008s
user    0m0.004s
sys     0m0.000s

sh$ time bash -c 'ls -f | head -1000>/dev/null'

real    0m0.008s
user    0m0.000s
sys     0m0.004s

Interestingly (?), with sed the time is spent in user space, whereas with head it is spent in sys. The results are consistent over several runs...
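The early exit described above can be observed with any producer that would otherwise never finish. Here is a minimal sketch, assuming a POSIX shell: first_lines is a hypothetical helper name, and yes merely stands in for ls or find.

```shell
# first_lines N: copy the first N lines of standard input, then exit.
# Once sed quits, the writer upstream gets SIGPIPE (or an EPIPE error)
# on a later write and stops instead of producing the rest of its output.
first_lines() {
    sed "${1}q"
}

# `yes` never stops on its own, so this pipeline only returns because
# the reader exits after three lines.
yes somefile | first_lines 3
```

The same mechanism is what lets ls -f | sed 1000q return before ls has written every name.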

Answer 3

You can use sed with q:

find ... | sed 10q  ## Prints 1st to 10th line.

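This makes sed exit after the 10th line, which may let find finish sooner: once sed is gone, a later write by find to the pipe typically fails with a broken pipe and find stops.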
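The q command can also be combined with a line range, so that a slice from the middle of the listing is printed without reading the rest of the input. This is a sketch, not something from the answer itself, and print_range is a made-up helper name:

```shell
# print_range FIRST LAST: print lines FIRST..LAST of standard input and
# quit right after line LAST, so later lines are not processed.
print_range() {
    # -n suppresses automatic printing; "p" prints the range, "q" quits.
    sed -n "${1},${2}p;${2}q"
}

printf '%s\n' one two three four five | print_range 2 4
```

By contrast, a plain sed -n FIRST,LASTp keeps reading until its input ends.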

Another approach is to use awk, though sed is still more efficient:

find ... | awk 'NR==11{exit}1'

Or

find ... | awk '1;NR==10{exit}'
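The two awk programs print the same lines but stop at different points: the first has to see line 11 before it exits, while the second exits right after printing line 10. Here is a parameterized sketch of the second form. Passing the limit with -v and the name first_n_awk are assumptions made for illustration.

```shell
# first_n_awk N: print the first N lines of standard input (N >= 1).
first_n_awk() {
    # The pattern "1" prints every line; the exit rule fires right after
    # the N-th line has been printed, so line N+1 is never processed.
    awk -v n="$1" '1; NR == n { exit }'
}

printf '%s\n' a b c d e | first_n_awk 3
```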

Answer 4

ls -f directory | sed -n 1,10p       # print line 1-10

Options for ls:

-f: do not sort (it also turns on -a, so ., .. and hidden files are listed)
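Because ls -f also behaves like ls -a, the "." and ".." entries show up in its output and count toward the line limit. Here is a minimal sketch that skips them before counting. list_first is a hypothetical helper, and it assumes file names contain no newline characters.

```shell
# list_first DIR N: print up to N entry names from DIR in directory order.
list_first() {
    ls -f "$1" | awk -v n="$2" '
        BEGIN { if (n < 1) exit }             # nothing requested
        $0 == "." || $0 == ".." { next }      # skip the two dot entries
        { print; if (++count >= n) exit }     # stop after N names
    '
}

# Example run on a throwaway directory.
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"
list_first "$tmp" 2
rm -r "$tmp"
```

The order of the names is whatever the filesystem returns, so which two of the three files are printed is not predictable.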

How to read the first N files from a directory (please, not a "head -n" solution)?
