Question

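My system is a dual-core i5 with hyper-threading, and Windows shows me 4 processors. When I run a single optimized CPU-bound task on one thread at a time, its service time is always reported as about 35 ms. But when I hand 2 tasks to 2 threads simultaneously, their service times are reported as about 70 ms. My system has 4 processors, so why is the service time 70 ms when 2 threads run their tasks? The 2 threads should run on 2 separate processors without any scheduling overhead. The code is as follows.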

The CPU-bound task is as follows.

import java.math.BigInteger;

public class CpuBoundJob implements Runnable {

    public void run() {
        BigInteger factValue = BigInteger.ONE;
        long t1 = System.nanoTime();

        // compute 2000! with BigInteger arithmetic
        for (int i = 2; i <= 2000; i++) {
            factValue = factValue.multiply(BigInteger.valueOf(i));
        }
        long t2 = System.nanoTime();

        System.out.println("Service Time(ms)=" + ((double) (t2 - t1) / 1000000));
    }

}

The thread that runs a task is as follows.

public class TaskRunner extends Thread {
    CpuBoundJob job=new CpuBoundJob();
    public void run(){

        job.run();
    }
}

Finally, the main class is as follows.

public class Test2 {
    int numberOfThreads = 100; // number of warm-up threads for the JIT

    public Test2() {
        for (int i = 1; i <= numberOfThreads; i++) { // warm-up code for the JIT
            TaskRunner t = new TaskRunner();
            t.start();
        }
        try {
            Thread.sleep(5000); // wait a little bit
        } catch (Exception e) {}
        System.out.println("Warmed up completed! now start benchmarking");
        System.out.println("First run single thread at a time");

        try { // wait for the warm-up threads to complete
            Thread.sleep(5000);
        } catch (Exception e) {}
        // run only one thread at a time
        TaskRunner t1 = new TaskRunner();
        t1.start();

        try { // wait for the thread to complete
            Thread.sleep(5000);
        } catch (Exception e) {}

        // now run 3 threads simultaneously
        System.out.println("Now run 3 thread at a time");

        for (int i = 1; i <= 3; i++) { // start 3 threads at once
            TaskRunner t2 = new TaskRunner();
            t2.start();
        }
    }

    public static void main(String[] args) {
        new Test2();
    }
}

Final output:

Warmed up completed! now start benchmarking
First run single thread at a time
Service Time(ms)=5.829112
Now run 2 thread at a time
Service Time(ms)=6.518721
Service Time(ms)=10.364269
Service Time(ms)=10.272689

Answer 1

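I timed this in various scenarios, and with a slightly modified task I got times of about 45 ms with one thread and about 60 ms with two threads. So even in this example, in one second, one thread can complete about 22 tasks, but two threads can complete about 33 tasks.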
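The throughput figures above follow from simple arithmetic: tasks per second equals the thread count times 1000, divided by the per-task time in milliseconds. A minimal sketch of that calculation (the class and method names are illustrative and not part of the original post):

```java
final class Throughput {

  // Tasks completed per second when `threads` workers each need
  // `taskMillis` milliseconds per task and run fully in parallel.
  static double tasksPerSecond(int threads, double taskMillis) {
    return threads * 1000.0 / taskMillis;
  }

  public static void main(String[] args) {
    // Timings quoted in the answer: ~45 ms single-threaded, ~60 ms with two threads.
    System.out.printf("one thread:  %.1f tasks/s%n", tasksPerSecond(1, 45));
    System.out.printf("two threads: %.1f tasks/s%n", tasksPerSecond(2, 60));
  }
}
```

Even though each individual task takes longer with two threads, the total number of tasks finished per second still goes up, just not by the full factor of two.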

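However, if you run a task that does not put heavy pressure on the garbage collector, you should see the performance gain you expect: two threads complete twice as many tasks. Here is my version of your test program.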

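Note that I made one significant change to your task (DirtyTask): n was always 0, because you cast the result of Math.random() to int (which yields zero) and only then multiplied by 13.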
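The precedence issue described above is easy to reproduce in isolation. In Java a cast binds more tightly than multiplication, so the cast applies to Math.random() alone. A small sketch contrasting the two forms (the class and method names are made up for illustration):

```java
final class CastPrecedence {

  // Mirrors the original expression: the cast applies to Math.random() alone,
  // so a value in [0, 1) is truncated to 0 before the multiplication.
  static int castThenMultiply() {
    return (int) Math.random() * 13;
  }

  // Parenthesized form: scale first, then truncate, giving a value from 0 to 12.
  static int multiplyThenCast() {
    return (int) (Math.random() * 13);
  }

  public static void main(String[] args) {
    int max = 0;
    for (int i = 0; i < 1_000; i++) {
      if (castThenMultiply() != 0) {
        throw new AssertionError("cast-first form produced a non-zero value");
      }
      max = Math.max(max, multiplyThenCast());
    }
    System.out.println("cast first: always 0; scale first: largest value seen = " + max);
  }
}
```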

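Then I added a CleanTask that does not generate any new objects for the garbage collector to process. Please test it on your machine and report the results. On mine, I got this: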

Testing "clean" task.
Average task time: one thread = 46 ms; two threads = 45 ms
Testing "dirty" task.
Average task time: one thread = 41 ms; two threads = 62 ms

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class Parallels
{

  private static final int RUNS = 10;

  public static void main(String... argv)
    throws Exception
  {
    System.out.println("Testing \"clean\" task.");
    flavor(CleanTask::new);
    System.out.println("Testing \"dirty\" task.");
    flavor(DirtyTask::new);
  }

  private static void flavor(Supplier<Callable<Long>> tasks)
    throws InterruptedException, ExecutionException
  {
    ExecutorService warmup = Executors.newFixedThreadPool(100);
    for (int i = 0; i < 100; ++i)
      warmup.submit(tasks.get());
    warmup.shutdown();
    warmup.awaitTermination(1, TimeUnit.DAYS);
    ExecutorService workers = Executors.newFixedThreadPool(2);
    long t1 = test(1, tasks, workers);
    long t2 = test(2, tasks, workers);
    System.out.printf("Average task time: one thread = %d ms; two threads = %d ms%n", t1 / (1 * RUNS), t2 / (2 * RUNS));
    workers.shutdown();
  }

  private static long test(int n, Supplier<Callable<Long>> tasks, ExecutorService workers)
    throws InterruptedException, ExecutionException
  {
    long sum = 0;
    for (int i = 0; i < RUNS; ++i) {
      List<Callable<Long>> batch = new ArrayList<>(n);
      for (int t = 0; t < n; ++t)
        batch.add(tasks.get());
      List<Future<Long>> times = workers.invokeAll(batch);
      for (Future<Long> f : times)
        sum += f.get();
    }
    return TimeUnit.NANOSECONDS.toMillis(sum);
  }

  /**
   * Do something on the CPU without creating any garbage, and return the 
   * elapsed time.
   */
  private static class CleanTask
    implements Callable<Long>
  {
    @Override
    public Long call()
    {
      long time = System.nanoTime();
      long x = 0;
      for (int i = 0; i < 15_000_000; i++)
        x ^= ThreadLocalRandom.current().nextLong();
      if (x == 0)
        throw new IllegalStateException();
      return System.nanoTime() - time;
    }
  }

  /**
   * Do something on the CPU that creates a lot of garbage, and return the 
   * elapsed time.
   */
  private static class DirtyTask
    implements Callable<Long>
  {
    @Override
    public Long call()
    {
      long time = System.nanoTime();
      String s = "";
      for (int i = 0; i < 10_000; i++)
        s += (int) (ThreadLocalRandom.current().nextDouble() * 13);
      if (s.length() == 10_000)
        throw new IllegalStateException();
      return System.nanoTime() - time;
    }
  }

}

Answer 2

    for(int i=0;i<10000;i++)
    {
        int n=(int)Math.random()*13;
        s+=name.valueOf(n);
        //s+="*";
    }

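This code is a tight loop around a resource that can only be accessed by one thread at a time. So each thread simply has to wait for the other to release the random number generator before it can access it.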

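As the docs for Math.random say:

When this method is first called, it creates a single new pseudorandom-number generator, exactly as if by the expression

new java.util.Random()

This new pseudorandom-number generator is used thereafter for all calls to this method and is used nowhere else.

This method is properly synchronized to allow correct use by more than one thread. However, if many threads need to generate pseudorandom numbers at a great rate, it may reduce contention for each thread to have its own pseudorandom-number generator.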
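If contention on the shared generator is indeed the bottleneck, one way to follow the documentation's advice is to give each thread its own generator through java.util.concurrent.ThreadLocalRandom. The sketch below is only an illustration under that assumption: the class and method names are invented, and it also swaps string concatenation for a StringBuilder, which the original loop did not use.

```java
import java.util.concurrent.ThreadLocalRandom;

final class PerThreadRandom {

  // Builds a digit string like the loop above, but draws from the calling
  // thread's own generator instead of the shared one behind Math.random().
  static String randomDigits(int count) {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < count; i++) {
      sb.append(ThreadLocalRandom.current().nextInt(13)); // value from 0 to 12
    }
    return sb.toString();
  }

  public static void main(String[] args) throws InterruptedException {
    Runnable job = () -> System.out.println(
        Thread.currentThread().getName() + ": " + randomDigits(10_000).length() + " chars");
    Thread a = new Thread(job, "worker-1");
    Thread b = new Thread(job, "worker-2");
    a.start();
    b.start();
    a.join(); // wait for both workers instead of sleeping for a fixed time
    b.join();
  }
}
```

Because ThreadLocalRandom.current() returns a generator that belongs to the calling thread, the two workers never touch the same generator state.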

Service times directly proportional to number of threads
