Shouldn't I see a difference in CPU usage between a single-threaded and a multi-threaded websocketpp server?

Asked: 2017-12-18 04:25:23

Tags: c++ multithreading boost-asio websocket++

I am running a multithreaded websocketpp server that I configured like this:

Server::Server(int ep) {
    using websocketpp::lib::placeholders::_1;
    using websocketpp::lib::placeholders::_2;
    using websocketpp::lib::bind;

    Server::wspp_server.clear_access_channels(websocketpp::log::alevel::all);

    Server::wspp_server.init_asio();

    Server::wspp_server.set_open_handler(bind(&Server::on_open, this, _1));
    Server::wspp_server.set_close_handler(bind(&Server::on_close, this, _1));
    Server::wspp_server.set_message_handler(bind(&Server::on_message, this, _1, _2));

    try {
        Server::wspp_server.listen(ep);
    } catch (const websocketpp::exception &e){
        std::cout << "Error in Server::Server(int): " << e.what() << std::endl;
    }
    Server::wspp_server.start_accept();
}

void Server::run(int threadCount) {
    boost::thread_group tg;

    for (int i = 0; i < threadCount; i++) {
        tg.add_thread(new boost::thread(
            &websocketpp::server<websocketpp::config::asio>::run,
            &Server::wspp_server));
        std::cout << "Spawning thread " << (i + 1) << std::endl;
    }

    tg.join_all();
}
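For reference, the pattern above — several threads all calling `run()` on the same io_service — behaves like a pool of workers draining one shared handler queue. A minimal standard-library sketch of that idea (the `TinyPool` class is hypothetical and only models the queue/worker relationship; it is not websocketpp or Asio):

```cpp
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Toy model of N threads servicing one shared handler queue, the way
// several threads calling io_service::run() share one event loop.
class TinyPool {
public:
    void post(std::function<void()> task) {
        std::lock_guard<std::mutex> lock(m_);
        tasks_.push(std::move(task));
    }

    // Spawn threadCount workers; each pops handlers until the queue is empty.
    void run(int threadCount) {
        std::vector<std::thread> workers;
        for (int i = 0; i < threadCount; ++i)
            workers.emplace_back([this] { drain(); });
        for (auto &w : workers) w.join();
    }

private:
    void drain() {
        for (;;) {
            std::function<void()> task;
            {
                std::lock_guard<std::mutex> lock(m_);
                if (tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // executed by whichever worker popped it first
        }
    }

    std::mutex m_;
    std::queue<std::function<void()>> tasks_;
};
```

The point of the model: posting more threads never creates more handlers to run; it only changes which thread runs each one.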

void Server::updateClients() {
    /*
       run updates
    */
    for (websocketpp::connection_hdl hdl : Server::conns) {
        try {
            std::string message = "personalized message for this client from the ran update above";
            wspp_server.send(hdl, message, websocketpp::frame::opcode::text);
        } catch (const websocketpp::exception &e) {
            std::cout << "Error in Server::updateClients(): " << e.what() << std::endl;
        }
    }
}

void Server::on_open(websocketpp::connection_hdl hdl) {
    boost::lock_guard<boost::shared_mutex> lock(Server::conns_mutex);
    Server::conns.insert(hdl);

    //do stuff


    //when the first client connects, start the update routine
    if (conns.size() == 1) {
        Server::running = true;
        bool *running = &(Server::running);
        std::thread([running] () {
            while (*running) {
                auto nextTime = std::chrono::steady_clock::now() + std::chrono::milliseconds(15);
                Server::updateClients();
                std::this_thread::sleep_until(nextTime);
            }
        }).detach();
    }
}
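As an aside, a plain `bool` flag written by `on_open`/`on_close` and read by the detached loop is a data race. A minimal sketch of the same start/stop ticker pattern with `std::atomic<bool>` (the `Ticker` class and its names are illustrative, not part of the code above):

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Fixed-rate ticker controlled by an atomic flag. tick() stands in for
// updateClients(); std::atomic<bool> makes the cross-thread flag safe.
class Ticker {
public:
    void start(std::chrono::milliseconds period, std::function<void()> tick) {
        running_.store(true);
        std::thread([this, period, tick] {
            auto next = std::chrono::steady_clock::now();
            while (running_.load()) {
                next += period;       // schedule relative to the last deadline
                tick();
                std::this_thread::sleep_until(next);
            }
        }).detach();
    }

    void stop() { running_.store(false); }

private:
    std::atomic<bool> running_{false};
};
```

The deadline is advanced before each tick so the period does not drift by the duration of `tick()` itself, matching the `sleep_until(nextTime)` intent in the original loop.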

void Server::on_close(websocketpp::connection_hdl hdl) {
    boost::lock_guard<boost::shared_mutex> lock(Server::conns_mutex);
    Server::conns.erase(hdl);

    //do stuff

    //stop the update loop when all clients are gone
    if (conns.size() < 1)
        Server::running = false;
}

void Server::on_message(
        websocketpp::connection_hdl hdl,
        websocketpp::server<websocketpp::config::asio>::message_ptr msg) {
    boost::lock_guard<boost::shared_mutex> lock(Server::conns_mutex);

    //do stuff
}

I start the server with:
int port = 9000;
Server server(port);
server.run(/* number of threads */);

When connections are added, the only significant difference is in the message sending [wspp_server.send(...)]. More and more clients don't really add anything to the internal computation; it's just an increase in the volume of messages.

My problem is that CPU usage doesn't seem to differ much whether I use one thread or several.

It doesn't matter whether I start the server with server.run(1) or server.run(4) (both on a dedicated server with a 4-core CPU). For a comparable load, the CPU usage graphs show roughly the same percentage. I expected the usage to be lower with 4 threads running in parallel. Am I thinking about this the wrong way?

At some point I had the feeling that the parallelism really applied to the listening part but not to the sending. So I tried wrapping the send in a new thread (which I detach), so that it runs independently of the sequence that needs it, but it didn't change anything on the graphs.

Shouldn't I see some difference in the work produced by the CPU? If not, what am I doing wrong? Is there another step I'm missing to force the messages to be emitted from different threads?

1 answer:

Answer 0 (score: 1)

"My problem is that the CPU usage doesn't seem to be that much different whether I use 1 or more threads."

That's not a problem. That's a fact. It just means that the whole thing isn't CPU bound. Which should be quite obvious, since it's network IO. In fact, high-performance servers often dedicate only 1 thread to all IO tasks, for this reason.
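One way to see this directly: threads that sit blocked waiting (here a sleep stands in for a network wait) accumulate wall-clock time but almost no process CPU time, no matter how many of them exist. A small illustration (the `measure_idle_threads` helper is made up for this example):

```cpp
#include <chrono>
#include <ctime>
#include <thread>
#include <vector>

// Model of an IO-bound server: threads blocked on a wait burn
// essentially no CPU, regardless of how many there are.
struct Times { double wall_ms; double cpu_ms; };

Times measure_idle_threads(int threadCount, int waitMs) {
    std::clock_t cpuStart = std::clock();               // process CPU time
    auto wallStart = std::chrono::steady_clock::now();  // wall-clock time

    std::vector<std::thread> waiters;
    for (int i = 0; i < threadCount; ++i)
        waiters.emplace_back([waitMs] {
            std::this_thread::sleep_for(std::chrono::milliseconds(waitMs));
        });
    for (auto &t : waiters) t.join();

    Times t;
    t.cpu_ms  = 1000.0 * (std::clock() - cpuStart) / CLOCKS_PER_SEC;
    t.wall_ms = std::chrono::duration<double, std::milli>(
                    std::chrono::steady_clock::now() - wallStart).count();
    return t;
}
```

Run it with 1 thread or 4 and the CPU-time figure stays near zero either way — which is exactly the flat CPU graph described in the question.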

"I was expecting the usage to be lower with 4 threads running in parallel. Am I thinking of this the wrong way?"

Yes, it seems so. You don't expect to pay less when you split the bill 4 ways either.

In fact, much like at the diner, you often end up paying more due to the overhead of splitting the load (cost/task). Unless you need more CPU capacity or lower reaction times than a single thread can deliver, a single IO thread is (obviously) more efficient because there is no scheduling overhead and/or context-switch penalty.
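The bill-splitting analogy in code: dividing a fixed amount of work across 4 threads can shorten the wall-clock time, but the total amount of work executed (and hence the total CPU time billed) is unchanged. A sketch, with a hypothetical `run_workload` helper:

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Split `total` units of busy-work across `threadCount` threads.
// The sum of all shares is always `total`: more threads never
// reduce the bill, they only split it.
long long run_workload(long long total, int threadCount) {
    std::atomic<long long> done{0};
    std::vector<std::thread> workers;
    long long share = total / threadCount;
    for (int i = 0; i < threadCount; ++i) {
        // Last thread picks up the remainder so nothing is dropped.
        long long n = (i == threadCount - 1) ? total - share * i : share;
        workers.emplace_back([n, &done] {
            volatile long long sink = 0;  // keep the loop from being optimized away
            for (long long j = 0; j < n; ++j) sink += j;
            done.fetch_add(n);
        });
    }
    for (auto &w : workers) w.join();
    return done.load();  // units actually executed, regardless of thread count
}
```

With 1 thread or 4, the returned total is identical; only the wall-clock time (and who did the work) changes.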

Another mental exercise:

  • If you run 100 threads, the processor will, in the optimal case, schedule them across all your available cores
  • Likewise, if there are other processes running on your system (which, obviously, there always are), the processor might schedule your 4 threads all on the same logical core. Do you expect the CPU load to be lower then? Why? (Hint: of course not).

Background: What is the difference between concurrency, parallelism and asynchronous methods?