Counting commas in a text (multithreaded): am I doing it right?

Date: 2016-10-09 09:00:46

Tags: java multithreading performance

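I'm counting the commas in a text with 5 threads: I split the text into 5 equal parts and let each thread process its own part. I just want to know whether I'm doing it right: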

public class CommaCounter implements Runnable {

    public static int commaCount = 0; // it's static so that all CommaCounter threads share the same variable

    private String text;
    private int startIndex;
    private int endIndex;

    public CommaCounter(String text, int startIndex, int endIndex) {
        this.text = text;
        this.startIndex = startIndex;
        this.endIndex = endIndex; 
    }

    @Override
    public void run() {

        for (int i = startIndex; i < endIndex; i++) {
            if(text.charAt(i) == ','){
                commaCount++; // is incrementing OKAY? is there danger of thread interference or memory consistency errors?
            }
        }
    }
}

The main method:

public class Demo {

    public static void main(String[] args) throws MalformedURLException, IOException, InterruptedException {

        long startTime = System.currentTimeMillis();
        /*
            I'll spare the code that was here for retrieving the text from a URL
        */

        String text = stringBuilder.toString();

        Set<Thread> threadCollection = new HashSet<>();

        int threadCount = 5;
        int textPerThread = text.length() / threadCount;
        for (int i = 0; i < threadCount; i++) {
            int start = i * textPerThread;
            Thread t = new Thread(new CommaCounter(text, start, start + textPerThread));
            threadCollection.add(t);
            t.start();
        }

        for (Thread thread : threadCollection) {
            thread.join(); // joining each CommaCounter thread, so that the code after the for loop doesn't execute prematurely
        }

        long endTime = System.currentTimeMillis();
        System.out.println("Counting the commas with " + threadCount + " threads took: " + (endTime - startTime) + "ms");
        System.out.println("Comma count: " + CommaCounter.commaCount);
    }

}

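My main concern is whether incrementing commaCount is done correctly, i.e. whether there is a danger of thread interference or memory consistency errors. I'm also puzzled that the execution time is no better than when I count the commas with a single thread (it's almost the same).

Any help would be much appreciated!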

2 answers:

Answer 0 (score: 3)

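Definitely not. You are modifying a static variable from multiple threads without any synchronization, and commaCount++ is a read-modify-write operation, not an atomic one, so increments can be lost. Use an AtomicInteger or a synchronized static method.

Exactly as the comment in your code suspects :)

Make it an AtomicInteger and use its getAndIncrement or incrementAndGet method,

or create a static synchronized method that does the increment,

or use a synchronized block, but in that case make sure every thread synchronizes on the same object! Since this concerns a static variable of the CommaCounter class, that object can be CommaCounter.class.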
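As a rough illustration of the first option, here is a minimal sketch that uses an AtomicInteger in place of the static int. The class name AtomicCommaCount and the method countCommas are made up for this example and are not from the question. The sketch also lets the last thread cover the characters left over by the integer division, which the loop in the question would skip whenever the text length is not a multiple of the thread count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper class, not part of the original question.
class AtomicCommaCount {

    // Counts commas in text with threadCount threads sharing one AtomicInteger.
    static int countCommas(String text, int threadCount) {
        AtomicInteger total = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        int chunk = text.length() / threadCount;

        for (int i = 0; i < threadCount; i++) {
            final int start = i * chunk;
            // The last thread also takes the remainder of the integer division.
            final int end = (i == threadCount - 1) ? text.length() : start + chunk;
            Thread t = new Thread(() -> {
                for (int j = start; j < end; j++) {
                    if (text.charAt(j) == ',') {
                        total.incrementAndGet(); // atomic read-modify-write
                    }
                }
            });
            threads.add(t);
            t.start();
        }

        for (Thread t : threads) {
            try {
                t.join(); // wait for every worker before reading the total
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(countCommas("a,b,,c,d", 3)); // prints 4
    }
}
```

A static synchronized increment method, or a block synchronized on CommaCounter.class, would be used in the same place as incrementAndGet here.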

Answer 1 (score: 1)

I think this problem is a good fit for the fork/join framework:

https://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html

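Because it uses a parallel divide-and-conquer approach, it keeps splitting the text and handing the pieces to the worker threads of a pool, while work stealing keeps all the threads busy.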
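To make the idea concrete, here is a minimal sketch of how the comma count could look as a RecursiveTask. The class name ForkJoinCommaCount, the countCommas helper, and the THRESHOLD value of 10,000 characters are all assumptions chosen for illustration; a sensible cutoff would have to be found by measuring.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical task class, not part of the original question.
class ForkJoinCommaCount extends RecursiveTask<Integer> {

    // Assumed cutoff: ranges of this size or smaller are counted sequentially.
    private static final int THRESHOLD = 10_000;

    private final String text;
    private final int start; // inclusive
    private final int end;   // exclusive

    ForkJoinCommaCount(String text, int start, int end) {
        this.text = text;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Integer compute() {
        if (end - start <= THRESHOLD) {
            // Small enough: count directly into a local variable, no shared state.
            int count = 0;
            for (int i = start; i < end; i++) {
                if (text.charAt(i) == ',') {
                    count++;
                }
            }
            return count;
        }
        // Otherwise split the range in half and process both halves in parallel.
        int mid = start + (end - start) / 2;
        ForkJoinCommaCount left = new ForkJoinCommaCount(text, start, mid);
        ForkJoinCommaCount right = new ForkJoinCommaCount(text, mid, end);
        left.fork();                        // schedule the left half asynchronously
        int rightCount = right.compute();   // do the right half in this thread
        return left.join() + rightCount;    // combine the partial results
    }

    static int countCommas(String text) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinCommaCount(text, 0, text.length()));
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 50_000; i++) {
            sb.append("a,");
        }
        System.out.println(countCommas(sb.toString()));
    }
}
```

Each task returns its own partial count, so this version needs no shared counter at all; the results are only combined when the subtasks are joined.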