Why do my threads take different amounts of time to execute the same amount of work?

Asked: 2016-03-11 12:55:44

Tags: java multithreading performance-testing

I have a simple multithreaded application in Java, as follows:

    class MyThreads extends Thread {
        public void run() {
            {
                // some thread initializations

                // every thread reads 2 files (its own files,
                // so node 0 will read A0.txt and B0.txt
                // and node 1 will read A1.txt and B1.txt)
                // The files have sizes between 10-20MB.
                // A's files contain different information for different nodes (A0.txt != A1.txt),
                // but B's files are the same (B0.txt has
                // the same info as B1.txt). This is just a scenario.

                // it stores the data that was
                // read before in the memory.
                // Again, I know B can be shared since
                // it has the same info in both threads, but it's not.
            }

            {
                // simple computation on the data retrieved
                // (addition, multiplication, etc)
                // I assume there is no need to synchronize
                // the threads since they apply operations on their own data.
                // Here, every thread executes the same number of operations
            }

            {
                // writing the results on different files. This phase is unimportant.
            }
        }

        public static void main(String args[]) {
            // start 4 threads
        }
    }

When measuring the performance of the initialization and computation parts, I got these strange results:

2016-03-11-NodeThread:1 time[2318] tag[initialization]
2016-03-11-NodeThread:0 time[2379] tag[initialization]
2016-03-11-NodeThread:2 time[2474] tag[initialization]
2016-03-11-NodeThread:3 time[2481] tag[initialization]
2016-03-11-NodeThread:2 time[30ms] tag[computation]
2016-03-11-NodeThread:1 time[6ms] tag[computation]
2016-03-11-NodeThread:3 time[7ms] tag[computation]
2016-03-11-NodeThread:0 time[6ms] tag[computation]

As you can see, the computation in NodeThread:2 took 30 ms, while the other nodes took less than 10 ms.

However, after inserting a barrier between initialization and computation, I got consistent results:

2016-03-11-NodeThread:1 time[2318] tag[initialization]
2016-03-11-NodeThread:0 time[2379] tag[initialization]
2016-03-11-NodeThread:2 time[2474] tag[initialization]
2016-03-11-NodeThread:3 time[2481] tag[initialization]
2016-03-11-NodeThread:2 time[30ms] tag[computation]
2016-03-11-NodeThread:1 time[33ms] tag[computation]
2016-03-11-NodeThread:3 time[29ms] tag[computation]
2016-03-11-NodeThread:0 time[31ms] tag[computation]

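My question is: if the threads do not communicate at all, read from different parts of the disk, and perform the same amount of computation, why do I need to synchronize them before the computation? My guess is that caching is involved, but I can't explain why.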

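NB: The machine on which I tested the code has 4 cores, and no other CPU-consuming processes were running. To measure the time I used perf4j, like this: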

    import org.perf4j.StopWatch;
    import org.perf4j.log4j.Log4JStopWatch;

    class MyThreads extends Thread {
        public void run() {
            {
                StopWatch stopWatch = new Log4JStopWatch();
                // some thread initializations

                // every thread reads 2 files (its own files,
                // so node 0 will read A0.txt and B0.txt
                // and node 1 will read A1.txt and B1.txt)
                // The files have sizes between 10-20MB.
                // A's files contain different information for different nodes (A0.txt != A1.txt),
                // but B's files are the same(B0.txt has
                // the same info as B1.txt). This is just a scenario.

                // it stores the data that was
                // read before in the memory.
                // Again, I know B can be shared since
                // it has the same info in both threads, but it's not.
                stopWatch.stop("initialization");
                // barrier (all threads wait here until every thread has finished initialization)
            }

            {
                StopWatch stopWatch = new Log4JStopWatch();
                // simple computation on the data retrieved
                // (addition, multiplication, etc)
                // I assume there is no need to synchronize
                // the threads since they apply operations on their own data.
                // Here, every thread executes the same number of operations
                stopWatch.stop("computation");
            }

            {
                // writing the results on different files. This phase is unimportant.
            }
        }

        public static void main(String args[]) {
            // start 4 threads
        }
    }
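
For readers who want to reproduce the setup, the `// barrier` placeholder could be realized with `java.util.concurrent.CyclicBarrier`. The following is only a minimal sketch, not the code from the question: the class names `BarrierSketch` and `NodeThread`, the `runAll` helper, and the use of `System.nanoTime()` instead of perf4j are all assumptions made for illustration, and the phase bodies are left empty.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class BarrierSketch {
    // Hypothetical worker; the phase bodies are placeholders, as in the question.
    static class NodeThread extends Thread {
        private final int id;
        private final CyclicBarrier barrier;
        long computationNanos = -1; // read by the caller only after join()

        NodeThread(int id, CyclicBarrier barrier) {
            this.id = id;
            this.barrier = barrier;
        }

        @Override
        public void run() {
            // initialization phase (reading this node's files) would go here

            try {
                barrier.await(); // block until all threads have finished initialization
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            } catch (BrokenBarrierException e) {
                return; // another thread failed or was interrupted while waiting
            }

            long start = System.nanoTime();
            // computation phase would go here
            computationNanos = System.nanoTime() - start;
        }
    }

    // Starts n workers that share one barrier and waits for all of them to finish.
    static NodeThread[] runAll(int n) {
        CyclicBarrier barrier = new CyclicBarrier(n);
        NodeThread[] threads = new NodeThread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new NodeThread(i, barrier);
            threads[i].start();
        }
        for (NodeThread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
        }
        return threads;
    }

    public static void main(String[] args) {
        for (NodeThread t : runAll(4)) {
            System.out.println("NodeThread:" + t.id + " computation ns: " + t.computationNanos);
        }
    }
}
```

With the barrier in place, all four computation phases start at roughly the same moment, which matches the second set of measurements above where every thread reports a similar time.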

1 Answer:

Answer 0: (score: 1)

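I can only guess, since more details would be needed to be sure, but what may be happening is that your first thread executes some code often enough that it gets compiled, and possibly optimized, by the HotSpot compiler and the other magic built into your JVM.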

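Your synchronization attempt probably prevents this from happening, possibly because the threads now start at around the same time and therefore finish their computation before the compilation takes place.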
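
One way to check whether JIT warm-up plays a role is to time the same computation several times in one JVM and compare the early rounds with the later ones. The sketch below is an assumption-laden illustration rather than a reproduction of the question's workload: `WarmupSketch`, `compute`, `timeRounds`, the array size, and the arithmetic are all made up, and it does not prove the guess above. It only shows how the effect could be observed.

```java
class WarmupSketch {
    // Written to after every round so the JIT cannot discard the computation as dead code.
    static volatile long sink;

    // Hypothetical stand-in for the "simple computation" from the question.
    static long compute(int[] data) {
        long acc = 0;
        for (int v : data) {
            acc += (long) v * 3 + 1;
        }
        return acc;
    }

    // Runs the same computation `rounds` times and returns each round's elapsed nanoseconds.
    static long[] timeRounds(int[] data, int rounds) {
        long[] elapsed = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink = compute(data);
            elapsed[r] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int[] data = new int[2_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        long[] elapsed = timeRounds(data, 10);
        for (int r = 0; r < elapsed.length; r++) {
            System.out.println("round " + r + ": " + (elapsed[r] / 1_000) + " us");
        }
    }
}
```

If the first rounds are noticeably slower than the later ones, compilation during the measurement is a plausible contributor. On a HotSpot JVM, running with `-XX:+PrintCompilation` prints a line each time a method is compiled, which can help confirm when that happens relative to the timed section.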