Slow file lookup in a large directory

Time: 2013-11-28 07:21:21

Tags: java file-io directory operating-system

If I create a File object for a file in a directory containing 500,000 files, will looking up and opening that file be slow? If so, what is the underlying reason?

2 answers:

Answer 0 (score: 0)

Performance usually starts to degrade once you have something on the order of tens of thousands of files in a directory, so yes, 500,000 files will probably kill your machine - this seems like a bad idea.

Answer 1 (score: 0)

I ran some measurements of file opening and directory listing using Java 1.6 on Linux 2.6.32 (avoiding JIT compiler noise). Opening a random file should be O(log N) per this, but there was no measurable slowdown up to 1 million files:

Opened random file in /tmp/fubar.100 in 0 ms
Last modified at 1385629306000
Opened random file in /tmp/fubar.1000 in 0 ms
Last modified at 1385631078000
Opened random file in /tmp/fubar.10000 in 0 ms
Last modified at 1385631054000
Opened random file in /tmp/fubar.100000 in 0 ms
Last modified at 1385630478000
Opened random file in /tmp/fubar.1000000 in 0 ms
Last modified at 1385632681000
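The answer does not include its benchmark source, but a minimal sketch of how such a measurement might look is below. The class and method names (`OpenBenchmark`, `timeRandomOpen`) are hypothetical, and the `/tmp/fubar.N` directory name simply mirrors the output above:

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class OpenBenchmark {
    // Populate dir with n empty files, then time how long it takes
    // to open one of them chosen at random.
    static long timeRandomOpen(File dir, int n) throws IOException {
        dir.mkdirs();
        for (int i = 0; i < n; i++) {
            new File(dir, "f" + i).createNewFile();
        }
        File target = new File(dir, "f" + (int) (Math.random() * n));

        long start = System.currentTimeMillis();
        FileInputStream in = new FileInputStream(target); // the actual open() call
        long elapsed = System.currentTimeMillis() - start;
        in.close();

        System.out.println("Opened random file in " + dir + " in " + elapsed + " ms");
        System.out.println("Last modified at " + target.lastModified());
        return elapsed;
    }

    public static void main(String[] args) throws IOException {
        timeRandomOpen(new File("/tmp/fubar.100"), 100);
    }
}
```

Millisecond granularity explains the "0 ms" readings: a single open on a warm dentry cache finishes well under a millisecond regardless of directory size.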

The performance of File.listFiles() appears to be O(n):

Listed 104 files in /tmp/fubar.100 in 2 ms
Listed 1001 files in /tmp/fubar.1000 in 9 ms (5x)
Listed 10001 files in /tmp/fubar.10000 in 19 ms (2x) 
Listed 100006 files in /tmp/fubar.100000 in 186 ms (10x)
Listed 1000002 files in /tmp/fubar.1000000 in 1909 ms (10x)
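A listing measurement along the lines above might be sketched as follows; again the class and method names (`ListBenchmark`, `timeList`) are hypothetical, not the answer's actual code:

```java
import java.io.File;

public class ListBenchmark {
    // Time File.listFiles() on a directory and report the entry count.
    static int timeList(File dir) {
        long start = System.currentTimeMillis();
        File[] files = dir.listFiles(); // reads every directory entry
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Listed " + files.length + " files in " + dir
                + " in " + elapsed + " ms");
        return files.length;
    }

    public static void main(String[] args) {
        timeList(new File("/tmp"));
    }
}
```

Since listFiles() must materialize a File object for every entry, a linear cost in the number of files is expected even when individual lookups stay fast.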

strace shows getdents() being called repeatedly, O(n) times:

$ grep getdents err.100|wc
     28    5006   72926
$ grep getdents err.1000|wc
     33   44514  669558
$ grep getdents err.10000|wc
    147  441327 6765305
$ grep getdents err.100000|wc
   1213 4409107 68693705
$ grep getdents err.1000000|wc
  11987 44085454 701243406