CPU is not fully utilized with multiple threads

Asked: 2013-10-25 22:47:58

Tags: java

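I am writing code for a recommendation engine that uses parallel threads for its computation. When it runs, the maximum CPU usage I get on an 8-core CPU is 265-300% (as reported by the top command). What I don't understand is why the full CPU is not used even though nearly 50% of it is idle. The part of the pseudocode that uses parallelism is: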

getRecoFromCandidates(){
    t=new Thread(new KNN(uid,profile,candidates));// does knn using closest neighbors
    rec=RecommendationsFromSet(NB_RECOMMENDATIONS,uid,candidatesI,val);// finds recommendations
    return rec;
}



RecommendationsFromSet() {
      Thread worker = new Thread(new storiesClickProcess(nbRec,candidatesI,likedSet));
      worker.start();
      threads.add(worker);
    }
    int running = 0;
    do {
      running = 0;
      for (Thread thread : threads) {
        if (thread.isAlive()) {
          running++;
        }
      }
    } while (running > 0);

    if(val==1){
        List<Story> list1 = new ArrayList<Story>();
        for(Integer sid: storiesClicks.keySet()){
            list1.add(new Story(sid,storiesClicks.get(sid)));
        }
        Collections.sort(list1,new StoriesList());
        for(i=0;i<nbRec;i++){
            rec.add(list1.get(i).getSid());
        }
    }



    return rec;
}


   parGetClosestNeighbor() {
      Thread worker = new Thread(new formNetworkProcess(nbFriends,networkScore[i],profile,candidatesI,candidatesISet) );
      worker.start();
      threads.add(worker);
    }
    int running = 0;
    do {
      running = 0;
      for (Thread thread : threads) {
        if (thread.isAlive()) {
          running++;
        }
      }
    } while (running > 0);

    List<Network> list = new ArrayList<Network>();

    for(i=0;i<TASK_NUM;i++){
        for(Integer sid: networkScore[i].keySet()){
            list.add(new Network(sid,networkScore[i].get(sid)));
        }
    }

    Collections.sort(list,new NetworkList());
    for(i=0;i<nbFriends;i++){
        friends.add(list.get(i).getUid());
    }

    synchronized(network.get(uid)) {
        network.get(uid).clear();
        for( i=0;i<friends.size();i++){
            network.get(uid).add(friends.get(i));
        }

    }


    return friends;
}

1 answer:

Answer 0 (score: 0)

If you have a multi-core CPU, top is not the best command for checking CPU usage. Use sar:

sar -P ALL 1 1000

This prints the usage of every core once per second, 1000 times.

Also, I do see the synchronized keyword in your code. That can cause contention. You can verify this with vmstat or with this command:

cat /proc/PID/status
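
If you would rather check for lock contention from inside the JVM, the standard java.lang.management API reports how often each thread had to wait to enter a synchronized block. The following is only a minimal sketch, not your actual code: the class ContentionProbe, the method measureBlockedCount and the shared LOCK object are hypothetical names, with LOCK standing in for network.get(uid) from the question.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ContentionProbe {
    // Hypothetical shared lock, standing in for network.get(uid) in the question.
    private static final Object LOCK = new Object();
    // Shared state that every worker updates while holding LOCK.
    static long counter = 0;

    /**
     * Starts nThreads workers that all compete for LOCK and returns the sum of
     * their blocked counts, i.e. how many times a thread had to wait for a monitor.
     */
    static long measureBlockedCount(int nThreads, int iterations) {
        counter = 0;
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Thread[] workers = new Thread[nThreads];
        long[] blocked = new long[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int idx = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    synchronized (LOCK) {
                        counter++;
                    }
                }
                // Read this thread's own statistics just before it finishes.
                ThreadInfo info = mx.getThreadInfo(Thread.currentThread().getId());
                if (info != null) {
                    blocked[idx] = info.getBlockedCount();
                }
            });
        }
        for (Thread w : workers) {
            w.start();
        }
        try {
            // join() waits without spinning, unlike polling isAlive() in a loop.
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        long total = 0;
        for (long b : blocked) {
            total += b;
        }
        return total;
    }

    public static void main(String[] args) {
        long total = measureBlockedCount(8, 1_000_000);
        System.out.println("counter = " + counter + ", total blocked count = " + total);
    }
}
```

A blocked count that is large relative to the number of iterations suggests the threads spend much of their time waiting for the lock rather than computing, which would be consistent with CPU usage staying well below 800%. The exact numbers vary from run to run and between machines.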