Question

Please take a look at the following link:

http://snippetsofjosh.wordpress.com/tag/advantages-and-disadvantages-of-arraylist/

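This is one of the reasons why I always prefer arrays over (Array)Lists. Still, it got me thinking about memory management and speed.

Hence I arrived at the following question:

What is the best way to store data from a file when you don't know the size of the file (/number of entries), where "best" is defined as "the least amount of computation time"?

Below I present 3 different methods, and I would like to know which one is best and why. To keep the question clear, let's assume I must end up with an array. Let's also assume that every line of our .txt file holds exactly one entry (/one string). Finally, to limit the scope, I am restricting this question to Java only.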

Let's say we want to retrieve the following information from a file called words.txt:

Hello
I 
am
a
test 
file

Method 1 - Double and dangerous

File read = new File("words.txt");
Scanner in = new Scanner(read);

int counter = 0;

while (in.hasNextLine())
{
    in.nextLine();
    counter++;
}

String[] data = new String[counter];

in = new Scanner(read);

int i = 0;

while (in.hasNextLine())
{
    data[i] = in.nextLine();
    i++;
}

Method 2 - Clear but redundant

File read = new File("words.txt");
Scanner in = new Scanner(read);

ArrayList<String> temporary = new ArrayList<String>();

while (in.hasNextLine())
{
    temporary.add(in.nextLine());
}

String[] data = new String[temporary.size()];

for (int i = 0; i < temporary.size(); i++)
{
    data[i] = temporary.get(i);
}

Method 3 - Short but rigid

File read = new File("words.txt");
FileReader reader = new FileReader(read);

String content = null;

char[] chars = new char[(int) read.length()];
reader.read(chars);
content = new String(chars);

String[] data = content.split(System.getProperty("line.separator"));

reader.close();

If you have another (even better) way, please provide it below. Also, feel free to adjust my code where necessary.

Answer:

The fastest way to store the data in an array is the following:

File read = new File("words.txt");
Scanner in = new Scanner(read);

ArrayList<String> temporary = new ArrayList<String>();

while (in.hasNextLine()) {
    temporary.add(in.nextLine());
}

String[] data = temporary.toArray(new String[temporary.size()]);

For Java 7+:

Path loc = Paths.get(URI.create("file:///Users/joe/FileTest.txt"));
List<String> lines = Files.readAllLines(loc, Charset.defaultCharset());
String[] array = lines.toArray(new String[lines.size()]);
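
The snippets above leave out their imports, and the Scanner version never closes the file. Below is a minimal self-contained sketch of both variants. The class name LineArrays, its method names, and the choice of UTF-8 in main are assumptions made for illustration and are not part of the original answer.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper class; all names here are illustrative only.
class LineArrays {

    // Scanner + ArrayList; try-with-resources closes the file when done.
    static String[] withScanner(File file, Charset charset) throws IOException {
        List<String> temporary = new ArrayList<>();
        try (Scanner in = new Scanner(file, charset.name())) {
            while (in.hasNextLine()) {
                temporary.add(in.nextLine());
            }
        }
        return temporary.toArray(new String[temporary.size()]);
    }

    // Java 7+ variant built on Files.readAllLines.
    static String[] withReadAllLines(Path path, Charset charset) throws IOException {
        List<String> lines = Files.readAllLines(path, charset);
        return lines.toArray(new String[lines.size()]);
    }

    public static void main(String[] args) throws IOException {
        // Write the sample words to a temporary file, then read them back both ways.
        Path path = Files.createTempFile("words", ".txt");
        Files.write(path, Arrays.asList("Hello", "I", "am", "a", "test", "file"),
                StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(withScanner(path.toFile(), StandardCharsets.UTF_8)));
        System.out.println(Arrays.toString(withReadAllLines(path, StandardCharsets.UTF_8)));
        Files.delete(path);
    }
}
```

Passing the charset explicitly, rather than relying on the platform default, keeps the result the same across machines.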

Answer 1

I'm assuming that "best" here means fastest.

I would use method 2, but create the array with the method provided by the Collection interface:

String[] array = temporary.toArray(new String[temporary.size()]);

Or even simpler (Java 7+):

List<String> lines = Files.readAllLines(file, charset);
String[] array = lines.toArray(new String[lines.size()]);

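Regarding your other methods:

Method 1 makes two passes over the file, and reading the whole file just to count its lines seems unlikely to be more efficient than resizing an ArrayList.
I'm not sure whether method 3 is faster.

UPDATE

For the sake of completeness, I ran a microbenchmark using method 2 as modified above, and included an additional method (method4) that reads all the bytes at once, creates a String, and splits it on newlines. The results (in nanoseconds):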

Benchmark   Mean 
method1     126.178
method2     59.679
method3     76.622
method4     75.293

EDIT:

With a larger 3MB file, LesMiserables.txt, the results are consistent:

Benchmark   Mean
method1     608649.322
method2     34167.101
method3     63410.496
method4     65552.79
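
The code for method4 is not shown in the answer. The following is only a sketch of the idea as described (read all bytes at once, build a String, split on newlines), not the code that was benchmarked. The class name ReadAllBytesSplit, the method name, and the "\r?\n" split pattern are assumptions.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical reconstruction of the "method4" idea; names are illustrative.
class ReadAllBytesSplit {

    static String[] readLines(Path path, Charset charset) throws IOException {
        // One call pulls the whole file into memory.
        byte[] bytes = Files.readAllBytes(path);
        // Decode the bytes with an explicit charset.
        String content = new String(bytes, charset);
        // Split on Unix or Windows line endings. String.split drops trailing
        // empty strings, so a final newline does not add an empty last entry.
        // Note: an empty file yields a one-element array holding "".
        return content.split("\r?\n");
    }
}
```

Unlike method 3 in the question, this does not depend on the platform's line.separator matching the line endings actually used in the file.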

Answer 2

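A very good comparison, with all the source code, is given here: java_tip_how_read_files_quickly

Bottom line:

For the best Java read performance, there are four things to remember:

Minimize I/O operations by reading an array at a time, not a byte at a time. An 8 KB array is a good size.
Minimize method calls by getting data an array at a time, not a byte at a time. Use array indexing to get at the bytes in the array.
Minimize thread synchronization locks if you don't need thread safety. Either make fewer method calls to a thread-safe class, or use a non-thread-safe class such as FileChannel and MappedByteBuffer.
Minimize data copying between the JVM/OS, internal buffers, and application arrays. Use FileChannel with memory mapping, or a direct or wrapped array ByteBuffer.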
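
As a rough illustration of the first and last points, the sketch below counts newline bytes in a file in two ways: once by reading 8 KB at a time into a byte array, and once through a memory-mapped FileChannel. The class name FastReadSketch and the newline-counting task are assumptions chosen to keep the example small; they do not come from the linked article.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical example class; names are illustrative only.
class FastReadSketch {

    // Reads the file 8 KB at a time and scans each chunk by array index.
    static long countNewlinesBuffered(Path path) throws IOException {
        byte[] buffer = new byte[8 * 1024];
        long count = 0;
        try (InputStream in = new FileInputStream(path.toFile())) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buffer[i] == '\n') {
                        count++;
                    }
                }
            }
        }
        return count;
    }

    // Maps the file into memory and scans the mapped buffer directly.
    // A single mapping is limited to Integer.MAX_VALUE bytes.
    static long countNewlinesMapped(Path path) throws IOException {
        long count = 0;
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            while (mapped.hasRemaining()) {
                if (mapped.get() == '\n') {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Which of the two is faster depends on the file size and the platform, so it is worth measuring on your own data before choosing.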

Hope that helps.

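EDIT

I would do it like this:

File read = new File("words.txt");
Scanner in = new Scanner(read);
List<String> temporary = new LinkedList<String>();

while (in.hasNextLine()) {
    temporary.add(in.nextLine());
}

String[] data = temporary.toArray(new String[temporary.size()]);

The main differences are that the data is read only once (unlike the other 2 methods), that adding to a linked list is very cheap, and that no extra operations (such as splitting) are needed on the lines. Don't use an ArrayList here.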

Answer 3

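If you are reading data from a file, the bottleneck will be the file-reading (IO) stage. In almost all cases, the time spent processing the data is negligible by comparison. So do the correct and safe thing. First make it right; then make it fast.

If you don't know the size of the file, you need some kind of dynamically expanding data structure. That is exactly what ArrayList is. Code you write yourself is unlikely to be more efficient or more correct than this important part of the Java API. So just use ArrayList: option 2.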
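
To make "dynamically expanding data structure" concrete, here is a rough sketch of what doing it by hand looks like: an array that is copied into a larger one whenever it fills up. The class GrowableLines, its initial capacity of 4, and the doubling policy are all assumptions for illustration. ArrayList already handles this growth internally, which is the point of the answer above.

```java
import java.util.Arrays;

// Hypothetical hand-rolled growable array, shown only to illustrate the idea.
class GrowableLines {
    private String[] items = new String[4]; // assumed initial capacity
    private int size = 0;

    // Appends a line, doubling the backing array when it is full.
    void add(String line) {
        if (size == items.length) {
            items = Arrays.copyOf(items, items.length * 2);
        }
        items[size++] = line;
    }

    // Returns a copy trimmed to the number of lines actually stored.
    String[] toArray() {
        return Arrays.copyOf(items, size);
    }
}
```

Every detail here (capacity checks, copying, trimming) is something ArrayList and its toArray method already provide and test for you.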

Answer 4

I would use Guava:

File file = new File("words.txt");
List<String> lines = Files.readLines(file, Charset.defaultCharset());
// If it really has to be an array:
String[] array = lines.toArray(new String[0]);

Answer 5

List<String> lines = Files.readAllLines(yourFile, charset);
String[] arr = lines.toArray(new String[lines.size()]);

Fastest way to read file data into an array (Java)
