Question

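I want to iterate over all files in a specific folder and count occurrences of specific words in each file. This should be done through a thread pool, and a new task should be forked only when the total size of the files exceeds a given file size limit. So, in another class, I collect all files in the given folder and call forkJoinPoolInstance.submit(new JobFileTask(files));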

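I checked some other threads on Stack Overflow and found answers about merging HashMaps, but they do not solve my problem. If I merge using the merge call in the compute() method below, the value for a given key comes back as null for all files in every folder. And if I put mapToReturn.putAll(readFile(f)); in place of that merge call, only one occurrence per file is counted, even when the keyword appears more often. What am I missing here?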
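For reference, here is a minimal sketch of the summation described above, assuming the goal is a single running total per keyword. `MergeSketch` and `mergeInto` are hypothetical names and not part of the question's code. Two facts about the standard API matter here: `Map.merge` calls its remapping function with (existing value, incoming value), not (key, value), and `putAll` replaces the value of an existing key instead of adding to it. One possible reading of the null symptom is that the loop in `compute()` walks the still-empty `mapToReturn` and merges into `temp`, so nothing ever reaches the map that is returned.

```java
import java.util.HashMap;
import java.util.Map;

public class MergeSketch {
    // Adds every count from `source` into `target`,
    // summing the values of keys that exist in both maps.
    public static void mergeInto(Map<String, Integer> target, Map<String, Integer> source) {
        // Iterate over the map being added, not the accumulator.
        // merge() stores `count` if the key is absent, else Integer.sum(existing, count).
        source.forEach((key, count) -> target.merge(key, count, Integer::sum));
    }

    public static void main(String[] args) {
        Map<String, Integer> total = new HashMap<>();

        Map<String, Integer> fileA = new HashMap<>();
        fileA.put("foo", 2);

        Map<String, Integer> fileB = new HashMap<>();
        fileB.put("foo", 3);
        fileB.put("bar", 1);

        mergeInto(total, fileA);
        mergeInto(total, fileB);

        System.out.println(total.get("foo") + " " + total.get("bar")); // prints "5 1"
    }
}
```

The same helper would apply to the results of the forked and directly computed subtasks, where `putAll` would otherwise let one subtask's count overwrite the other's.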

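import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

public class JobFileTask extends RecursiveTask<Map<String, Integer>>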
{
    private File[] files;

    public JobFileTask (File[] f)
    {
        this.files = f;
    }

    @Override
    protected Map<String, Integer> compute()
    {
        System.out.println("Computing... ");
        Map<String, Integer> mapToReturn = new ConcurrentHashMap<>();
        long currentSize = 0;
        ArrayList<File> newFiles = new ArrayList<>();
        ArrayList<File> otherFiles = new ArrayList<>();

        if (files.length == 0)
        {
            return mapToReturn;
        }

        for (File f: files)
        {
            currentSize += f.length();
            if (currentSize <= Main.getScanningSizeLimit())
            {
                newFiles.add(f);
                Map<String, Integer> temp = readFile(f);
                mapToReturn.entrySet()
                   .forEach(entry -> temp.merge(
                            entry.getKey(),
                            entry.getValue(),
                            (key, value) -> entry.getValue() + value));
            }
            else
            {
                ForkJoinTask<Map<String, Integer>> forkTask = new JobFileTask((File[]) newFiles.toArray(new File[newFiles.size()]));

                forkTask.fork();

                for (File fs: files)
                {
                    if (!newFiles.contains(fs))
                    {
                        otherFiles.add(fs);
                    }
                }

                JobFileTask callTask = new JobFileTask((File[]) otherFiles.toArray(new File[otherFiles.size()]));

                Map<String, Integer> forkResult = callTask.compute();

                Map<String, Integer> callResult = forkTask.join();

                mapToReturn.putAll(forkResult);
                mapToReturn.putAll(callResult);
                break;
            }
        }

        return mapToReturn;
    }

    public Map<String, Integer> readFile(File f)
    {
        Map<String, Integer> toReturn = new ConcurrentHashMap<>();
        try
        {
            String line = "";
            BufferedReader bufferedReader = new BufferedReader(new FileReader(f));
            while ((line = bufferedReader.readLine()) != null)
            {
                for (String k: Main.getKeywords())
                {
                    if (line.contains(k))
                    {
                        if (toReturn.containsKey(k))
                        {
                            toReturn.put(k, toReturn.get(k)+1);
                        }
                        else
                        {
                            toReturn.put(k, 1);
                        }
                    }
                }
            }
            bufferedReader.close();
        } 
        catch (Exception e)
        {
            e.printStackTrace();
        }
        return toReturn;
    }
}
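
A separate point about `readFile`: `line.contains(k)` adds at most 1 per line, so a keyword that appears several times on one line is counted once. Below is a minimal sketch of counting every occurrence instead, assuming plain substring matching is what is wanted (as with `contains`, "cat" would also match inside "category"). `OccurrenceSketch` and `countOccurrences` are hypothetical names used only for illustration.

```java
public class OccurrenceSketch {
    // Counts non-overlapping occurrences of `keyword` in `line`.
    public static int countOccurrences(String line, String keyword) {
        if (keyword.isEmpty()) {
            return 0; // an empty keyword matches everywhere; treat it as no match
        }
        int count = 0;
        int from = line.indexOf(keyword);
        while (from >= 0) {
            count++;
            // Continue searching just past the match that was found.
            from = line.indexOf(keyword, from + keyword.length());
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("foo bar foo foo", "foo")); // prints 3
    }
}
```

Inside the keyword loop, the per-line result could then be accumulated with something like `toReturn.merge(k, n, Integer::sum)` whenever `n` is greater than zero, which also replaces the `containsKey`/`put` branches.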

Merging two hash maps in a RecursiveTask

0 answers: