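I have a program designed to run on a number of different nodes and transfer files between them, based on instructions received from a process running on the master node. Each process acts as both a sender and a receiver.
My strategy is: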
// Set up a listener socket on port 63000 (say) to listen for commands from the master.
// This listener only listens for commands and reads them.
/* 1) Set up a listener socket on port 63001 (say) to listen for connection requests
      from the (NUMNODES-1) other processes on the rest of the nodes
   2) Accept each of the connections from the other (NUMNODES-1) nodes and store
      these connected descriptors in an array of integers */
for(i=1;i<NUMNODES;i++){
    connfd=accept(...,&node_addr,...);
    index=findNodeIndex(node_addr); // Find the node number corresponding to this address
    connections[index]=connfd;
}
/* Skipping the details: assuming I have a process running on node number 4 (out
   of a total of 10 nodes, say), connections[4] will be -1, and connections[1..3, 5..10]
   will hold connection descriptors to each of the other nodes, ready to read any file
   that they decide to transfer */
fd_set read_set;
for(i=1;i<=NUMNODES;i++){
    if(i==thisNodeNum()) // For node number 4, connections[4] will be -1
        continue;        // so we don't want to put it into the read fd_set
    FD_SET(connections[i],&read_set);
}
fd_set temp_rset;
// A select() call ready to listen for any data that comes in from any of the nodes
for(;;){
    temp_rset=read_set;
    select(maxfdp1,&temp_rset,NULL,NULL,NULL);
    // Listening for commands from the master goes here
    if(FD_ISSET(commandListener,&temp_rset)){
        ... listen to command and send file to a particular node...
        ... File will be sent to port 63001 on that node...
        ... commandListener listens for commands on port 63000 (just a reminder)...
    }
    // Listening for data that has come in from the other nodes
    for(i=1;i<=NUMNODES;i++){
        if(FD_ISSET(connections[i],&temp_rset)){
            ... write file to the local hard disk, byte by byte...
        }
    } // End of connections[1..NUMNODES] testing for data to read
} // End of infinite for loop
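My problem is this: my master sends commands to whichever node it likes on port 63001, and the commands are received and acted upon. The file is sent byte by byte to the appropriate node. (Say the master commands node 5 to send a file to node 9: the process on node 5 uses connections[9] to send the file to the process on node 9, and the process on node 9 receives the data on connections[5]. At least, that is what I want to happen.)
The file is sent to the right port on the right node (node 9, port 63001), but the FD_ISSET(connections[i], &temp_rset) condition on node 9 never detects any of the sent data. I have checked with tshark and tcpdump, and the data really is being sent to node 9, but the select() call never picks anything up.
What am I doing wrong?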
Answer 0 (score: 1)
Your code should look like this:
fd_set read_set; // set of file descriptors (in this case sockets)
int result;
for(;;)
{
    FD_ZERO(&read_set); // you need to clear the set first!
    FD_SET(commandListener, &read_set); // add command socket to the set
    for(i=1; i<=NUMNODES; i++) // add node sockets to the set
    {
        if(i==thisNodeNum()) continue;
        FD_SET(connections[i], &read_set);
    }
    // check the status of all sockets in the set
    // (instead of tracking maxfdp1 you can use FD_SETSIZE, the maximum allowed size of a set)
    result = select(maxfdp1, &read_set, NULL, NULL, NULL);
    if(result == -1) // error
    {
        perror("select() error");
    }
    else if(result > 0) // there is new data
    {
        if(FD_ISSET(commandListener, &read_set)) // there is new command data
        {
            // (...) handling your tasks here
        }
        for(i=1; i<=NUMNODES; i++)
        {
            if(i==thisNodeNum()) continue; // connections[i] is -1 for this node
            if(FD_ISSET(connections[i], &read_set)) // there is new data from the i-th node
            {
                // (...) handling your tasks here
            }
        }
    }
} // End of infinite for loop
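Notes:
You forgot to clear and rebuild the set of file descriptors (sockets) in your code. select() modifies the set, and it is the modified set that you later check with FD_ISSET.
You can put all of the sockets in one set and check them all at once with select(), then identify which sockets, if any, have new data.
I am not sure how the "connections" array is indexed, because the loops in your code run over 1..NUMNODES rather than the natural C/C++ array indices 0..NUMNODES-1.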
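The first note can be illustrated with a small helper that rebuilds the set before every select() call. This is only a sketch under stated assumptions: `build_read_set` is a hypothetical name that does not appear in the original code, unused slots (such as this node's own entry) are assumed to hold -1, and the array is walked with 0-based indices.

```c
#include <sys/select.h>

/* Hypothetical helper (not from the original code): clear `set`, add every
 * valid descriptor from fds[0..n-1], and return the value to pass as the
 * first argument of select() (highest descriptor + 1, or 0 if none).
 * Assumption: unused slots, such as this node's own entry, hold -1. */
int build_read_set(const int *fds, int n, fd_set *set)
{
    int maxfd = -1;

    FD_ZERO(set);                      /* start from an empty set every time */
    for (int i = 0; i < n; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            continue;                  /* skip empty or out-of-range slots */
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd + 1;
}
```

Calling such a helper at the top of each loop iteration means the set that select() overwrote on the previous pass is never reused. The command socket would still have to be added separately with FD_SET, and the returned value adjusted if that descriptor is the largest.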
Answer 1 (score: 1)
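Your accept loop accepts connections from NUMNODES-1 other nodes. So if you have NUMNODES nodes all running this code, every node must also call connect to every other node. That means you end up with two connections between each pair of nodes, one initiated from each direction.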
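Now, when you set up read_set for the select loop, it looks like you are only watching the connections you accepted, not the ones you initiated with connect. If you are also sending data on the accepted connections (rather than the connected ones), the process on the other end will not notice it, because it is waiting on its accepted connection rather than its connected one.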