Question

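I built an applet search tool: I give it a string as input, and it finds that string in a specified file or folder.
I have it working, but I am not happy with its performance; it takes too long to respond. I profiled it to see what was going on and noticed that scanner.hasNextLine() accounts for most of the time.
This method is essential to my program, since I have to read every line to find the string. Is there another way to improve its performance and reduce the execution time?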

Here is the code where I use this method:

fw = new FileWriter("filePath", true);
bw = new BufferedWriter(fw);

for (File file : filenames) {
    if (file.isHidden())
        continue;

    if (!file.isDirectory()) {
        Scanner scanner = new Scanner(file);
        int cnt = 0;
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            if (!exactMatch) {
                if (!caseSensitive) {
                    if (line.toLowerCase().contains(searchString.toLowerCase())) {
                        // System.out.println(line);
                        cnt += StringUtils.countMatches(line.toLowerCase(),
                                searchString.toLowerCase());
                    }
                } else {
                    if (line.contains(searchString)) {
                        // System.out.println(line);
                        cnt += StringUtils.countMatches(line,
                                searchString);
                    }
                }
            }

Yes, the toLowerCase() method also takes up a fair amount of time.

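I have changed my code and now use BufferedReader instead of Scanner, as Alex and Nrj suggested, and I saw a good improvement in my application's performance.
It now runs in about a third of the time the earlier version took. Thanks to everyone for the replies.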

Answer 1

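Following your question I looked at the code of Scanner, and I think you are right: it is not optimized for large amounts of data. I suggest you use a simple FileInputStream wrapped in an InputStreamReader and a BufferedReader: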

BufferedReader r = new BufferedReader(new InputStreamReader(new FileInputStream(fileName)));

Then read line by line:

r.readLine()

If that is still not enough, try reading a large batch of lines at once and then processing them.
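A minimal sketch of the line-by-line loop described above, in case it helps. The class name LineSearch and its method names are made up for illustration, UTF-8 is an assumed encoding, and the counting uses plain indexOf rather than the StringUtils.countMatches call from the question:

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the asker's code: counts matches using BufferedReader.
class LineSearch {

    // Counts non-overlapping occurrences of needle in a single line.
    static int countInLine(String line, String needle) {
        if (needle.isEmpty()) {
            return 0; // an empty search string would otherwise never advance
        }
        int count = 0;
        int from = line.indexOf(needle);
        while (from != -1) {
            count++;
            from = line.indexOf(needle, from + needle.length());
        }
        return count;
    }

    // Reads line by line until readLine() returns null (end of input).
    static int countInReader(BufferedReader reader, String needle) throws IOException {
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            count += countInLine(line, needle);
        }
        return count;
    }

    // Opens the file the way the answer suggests; UTF-8 is an assumption.
    static int countInFile(String fileName, String needle) throws IOException {
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), StandardCharsets.UTF_8))) {
            return countInReader(r, needle);
        }
    }
}
```

Calling LineSearch.countInFile(fileName, searchString) once per file would replace the Scanner loop; try-with-resources also closes the reader, which the original loop never did for its Scanner.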

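As for toLowerCase(), you could try a regular expression instead. The benefit is that you do not have to change the case of every line. The drawback is that, in simple cases, a regular expression is a bit slower than a plain string comparison.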
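One way the regular-expression idea might look, as a sketch only. RegexCount and its methods are hypothetical names; Pattern.LITERAL is used so that characters such as "." in the search string are matched literally, and the pattern is compiled once and reused for every line:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: case-insensitive counting without calling toLowerCase().
class RegexCount {

    // Compile once per search, outside the line loop.
    static Pattern compileLiteral(String searchString, boolean caseSensitive) {
        if (searchString.isEmpty()) {
            throw new IllegalArgumentException("search string must not be empty");
        }
        int flags = Pattern.LITERAL; // treat the search string as plain text
        if (!caseSensitive) {
            flags |= Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        }
        return Pattern.compile(searchString, flags);
    }

    // Counts non-overlapping matches of the pattern in one line.
    static int count(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }
}
```

Whether this actually beats lowercasing each line depends on the data, so it is worth profiling both variants on your own files.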

Answer 2

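I would suggest redesigning your solution and using something like Lucene to do the searching for you. With Lucene you can index and search files much more efficiently. A tutorial on how to use it with text files can be found here: http://www.avajava.com/tutorials/lessons/how-do-i-use-lucene-to-index-and-search-text-files.html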

Answer 3

(Only a very small optimization, in response to the comments above.)

            if(!caseSensitive)
            {
                searchString = searchString.toLowerCase();
            }
            while (true) {
                String line = bufferedReader.readLine();
                if (line == null)
                    break;
                if(!caseSensitive)
                {
                    line = line.toLowerCase();
                }
                if(!exactMatch)
                {
                    if (line.contains(searchString)) {
                        // System.out.println(line);
                        cnt += StringUtils.countMatches(line,
                                searchString);
                    }
                }

Answer 4

Try using BufferedReader.
Make use of threads. You can search the files in parallel, which can shorten the search time.
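A rough sketch of how the parallel search could be set up with a fixed thread pool, one task per file. ParallelSearch, its method names, the UTF-8 encoding and the thread count are all assumptions made for illustration:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper: searches several files concurrently.
class ParallelSearch {

    // Counts non-overlapping occurrences of needle in one file (UTF-8 assumed).
    static int countInFile(Path file, String needle) throws IOException {
        if (needle.isEmpty()) {
            return 0;
        }
        int count = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                int from = line.indexOf(needle);
                while (from != -1) {
                    count++;
                    from = line.indexOf(needle, from + needle.length());
                }
            }
        }
        return count;
    }

    // Submits one task per file and collects the counts in input order.
    static Map<Path, Integer> search(List<Path> files, String needle, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Map<Path, Future<Integer>> pending = new LinkedHashMap<>();
            for (Path file : files) {
                Callable<Integer> task = () -> countInFile(file, needle);
                pending.put(file, pool.submit(task));
            }
            Map<Path, Integer> results = new LinkedHashMap<>();
            for (Map.Entry<Path, Future<Integer>> entry : pending.entrySet()) {
                results.put(entry.getKey(), entry.getValue().get()); // blocks until done
            }
            return results;
        } finally {
            pool.shutdown(); // stop accepting tasks; submitted ones have finished
        }
    }
}
```

How much this helps depends on the disk and the number of cores; on a single slow disk the threads may mostly wait on I/O, so measure before and after.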

Answer 5

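I would not use Java to search the file system for string matches. Instead, call a native tool from Java. I would use something like this to invoke grep from Java: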

ProcessBuilder pb = new ProcessBuilder("grep", "-r", "foo");
pb.directory(new File("myDir"));
Process p = pb.start();
InputStream in = p.getInputStream();
//Do whatever you prefer with the stream

Improving the performance of a Java program
