Consider the following two processes:
sender.cpp:
#include <zhelpers.h>
...
zmq::context_t ctx(1);
...
void foo(int i)
{
    zmq::socket_t sender(ctx, ZMQ_REQ);
    sender.connect("tcp://hostname:5000");
    std::stringstream ss;
    ss <<"bar_" <<i;
    std::string bar_i(std::move(ss.str());
    s_sendmore(sender, "foo ");
    (i != N) ? s_send(sender, bar, 0) : s_send(sender, "done", 0);
    s_recv(sender);
}
int main()
{
    for(int i=0; i<=100000; ++i)
        foo(i);
    return 0;
}
receiver.cpp
#include <zhelpers.h>
...
int main()
{
    zmq::context_t ctx(1);
    zmq::socket_t rcv(ctx, ZMQ_REP);
    rcv.bind("tcp://*:5000");
    std::string s1("");
    std::string s2("");
    while(s2 != "done")
    {
        s1 = std::move(s_recv(rcv));
        s2 = std::move(s_recv(rcv));
        std::cout <<"received: " <<s1 <<" " <<s2 <<"\n";
        s_send(rcv, "ACK");
    }
    return 0;
}
Let's start both processes. What I expect is that the receiver process receives every message the sender sends to it and prints them out:
foo bar_1
foo bar_2
...
and so on, until:
...
foo bar_100000
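I would expect it to do this without ever blocking.
My problem is that the receiver always gets stuck at around iteration 28215 (always around this number!!!) and blocks for a minute or so. Then it continues on towards 100000, but sometimes gets stuck again. My question is, of course: why does this happen, and how can I fix it?
I tried putting 'sender' in global scope instead of inside foo(.), and then it works: in that case all the printouts from 1 to 100000 go through smoothly and super fast, without any blocking (of course, in that case the socket is not created every time foo(.) is called). Unfortunately, I cannot do that in my real code.
I would like to understand why this happens.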
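Answer 0 (score: 0)
First of all, your examples are not reliable, because they do not compile. So here are some examples that should be close to your intent and that actually compile: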
sender.cpp
#include <zmq.hpp>
#include <algorithm>
#include <cstring>
#include <iostream>
#include <string>
void send(const std::string& msg)
{
    // Prepare our context and socket (created anew on every call)
    zmq::context_t context (1);
    zmq::socket_t socket (context, ZMQ_REQ);
    std::cout << "Connecting to receiver ..." << std::endl;
    socket.connect ("tcp://localhost:5555");
    // Fixed-size, zero-filled 100-byte message. Copy at most 99 bytes so the
    // payload stays NUL-terminated and we never read past the end of msg.
    zmq::message_t request (100);
    std::memset (request.data (), 0, 100);
    std::memcpy (request.data (), msg.c_str (), std::min<std::size_t> (msg.size (), 99));
    std::cout << "Sending message " << msg << "..." << std::endl;
    socket.send (request);
}
int main ()
{
    for(int i = 0; i < 100000; ++i)
    {
        send(std::to_string(i));
    }
    send("done");
}
compiled and linked with
g++ -std=c++11 -I/home/dev/cppzmq -I/home/dev/libzmq/include sender.cpp -lzmq -o sender
receiver.cpp
#include <zmq.hpp>
#include <algorithm>
#include <cstring>
#include <iostream>
#include <string>
int main () {
    // Prepare our context and socket
    zmq::context_t context (1);
    zmq::socket_t socket (context, ZMQ_REP);
    socket.bind ("tcp://*:5555");
    char buf[100] = {0};
    while (std::string(buf).compare("done")) {
        zmq::message_t request;
        // Wait for next request from client
        socket.recv (&request);
        // Copy at most 99 bytes so buf always stays NUL-terminated
        std::memset(buf, 0, sizeof(buf));
        std::memcpy(buf, request.data(), std::min(request.size(), sizeof(buf) - 1));
        std::cout << "Received message " << buf << std::endl;
        // Send reply back to client
        zmq::message_t reply (5);
        std::memcpy (reply.data (), "Hello", 5);
        socket.send (reply);
    }
    return 0;
}
compiled and linked with
g++ -std=c++11 -I/home/dev/cppzmq -I/home/dev/libzmq/include receiver.cpp -lzmq -o receiver
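When I start the processes, everything seems to work fine; the output on the receiver comes through without interruption, as expected: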
Received message 99996
Received message 99997
Received message 99998
Received message 99999
Received message done
But, as I expected, take a look at netstat:
netstat
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:38345 localhost:5555 TIME_WAIT
tcp 0 0 localhost:46228 localhost:5555 TIME_WAIT
tcp 0 0 localhost:60309 localhost:5555 TIME_WAIT
tcp 0 0 localhost:46916 localhost:5555 TIME_WAIT
tcp 0 0 localhost:47600 localhost:5555 TIME_WAIT
tcp 0 0 localhost:54454 localhost:5555 TIME_WAIT
tcp 0 0 localhost:46409 localhost:5555 TIME_WAIT
tcp 0 0 localhost:51142 localhost:5555 TIME_WAIT
tcp 0 0 localhost:40355 localhost:5555 TIME_WAIT
tcp 0 0 localhost:40005 localhost:5555 TIME_WAIT
tcp 0 0 localhost:45614 localhost:5555 TIME_WAIT
tcp 0 0 localhost:48974 localhost:5555 TIME_WAIT
tcp 0 0 localhost:41427 localhost:5555 TIME_WAIT
tcp 0 0 localhost:58740 localhost:5555 TIME_WAIT
tcp 0 0 localhost:58754 localhost:5555 TIME_WAIT
tcp 0 0 localhost:60044 localhost:5555 TIME_WAIT
tcp 0 0 localhost:57478 localhost:5555 TIME_WAIT
tcp 0 0 localhost:50419 localhost:5555 TIME_WAIT
tcp 0 0 localhost:44361 localhost:5555 TIME_WAIT
tcp 0 0 localhost:37284 localhost:5555 TIME_WAIT
tcp 0 0 localhost:38662 localhost:5555 TIME_WAIT
tcp 0 0 localhost:45968 localhost:5555 TIME_WAIT
tcp 0 0 localhost:57407 localhost:5555 TIME_WAIT
tcp 0 0 localhost:59200 localhost:5555 TIME_WAIT
tcp 0 0 localhost:41292 localhost:5555 TIME_WAIT
tcp 0 0 localhost:55243 localhost:5555 TIME_WAIT
tcp 0 0 localhost:51489 localhost:5555 TIME_WAIT
tcp 0 0 localhost:48865 localhost:5555 TIME_WAIT
tcp 0 0 localhost:35491 localhost:5555 TIME_WAIT
...
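After a single run I have more than 20k(!) such sockets in the TIME_WAIT state. This is because the socket in void send(...) of sender.cpp is a scope-local variable. I don't know exactly what zmq does when it destroys a socket that goes out of scope, but I am pretty sure it calls close() on the socket's fd, which leaves the socket in the TIME_WAIT state. Even though my sender and receiver run smoothly, I don't know how your system copes with that many sockets. Also, I don't know what your zhelpers.h file does. What I do know is that if you put the socket in global scope, there will be only one close() call, on one socket, on the sender side. I would start investigating from here. Maybe have a look at how-to-forcibly-close-a-socket-in-time-wait ...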