Calling ioctl() with FIONREAD in an apparent race condition produces strange side effects

Time: 2014-12-15 15:32:56

Tags: c++ sockets race-condition ioctl

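I am writing a parallel neural network simulator, and I recently ran into a problem in my code that completely baffles me (granted, I am only an intermediate C++ programmer, so maybe I am missing something obvious?). My code involves a "server" and a number of clients (workers) that fetch work from it and return results to the server. Here is the server part: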

#include <iostream>
#include <fstream>

#include <arpa/inet.h>
#include <sys/epoll.h>
#include <errno.h>

#include <sys/ioctl.h>

void advanceToNextInputValue(std::ifstream &trainingData, char &nextCharacter)
   {

      nextCharacter = trainingData.peek();
      while(nextCharacter != EOF && !isdigit(nextCharacter))
         {
sleep(1);
            trainingData.get();
sleep(1);
            nextCharacter = trainingData.peek();
         }
   }

int main()
   {
      // Create a socket,...
      int listenerSocketNum = socket(AF_INET, SOCK_STREAM, 0);

      // Name the socket,...
      sockaddr_in socketAddress;
      socklen_t socketAddressLength = sizeof(socketAddress);

      inet_pton(AF_INET, "127.0.0.1", &(socketAddress.sin_addr));
      socketAddress.sin_port = htons(9988);
      bind(listenerSocketNum, reinterpret_cast<sockaddr*>(&socketAddress), socketAddressLength);

      // Create a connection queue for worker processes waiting to connect to this server,...
      listen(listenerSocketNum, SOMAXCONN);


      int epollInstance = epoll_create(3); // Expected # of file descriptors to monitor

      // Allocate a buffer to store epoll events returned from the network layer
      epoll_event* networkEvents = new epoll_event[3];

      // Add the server listener socket to the list of file descriptors monitored by epoll,...
      networkEvents[0].data.fd = -1; // A marker returned with the event for easy identification of which worker process event belongs to
      networkEvents[0].events = EPOLLIN | EPOLLET; // epoll-IN- since we only expect incoming data on this socket (ie: connection requests from workers),...
                                                   // epoll-ET- indicates an Edge Triggered watch
      epoll_ctl(epollInstance, EPOLL_CTL_ADD, listenerSocketNum, &networkEvents[0]);


      char nextCharacter = 'A';
      std::ifstream trainingData;

      // General multi-purpose/multi-use variables,...
      long double v;
      signed char w;
      int x = 0;
      int y;

      while(1)
         {
            y = epoll_wait(epollInstance, networkEvents, 3, -1); // the -1 tells epoll_wait to block indefinitely

            while(y > 0)
               {
                  if(networkEvents[y-1].data.fd == -1) // We have a notification on the listener socket indicating a request for a new connection (and we expect only one for this testcase),...
                     {
                        x = accept(listenerSocketNum,reinterpret_cast<sockaddr*>(&socketAddress), &socketAddressLength);

                        networkEvents[y-1].data.fd = x; // Here we are just being lazy and re-using networkEvents[y-1] temporarily,...
                        networkEvents[y-1].events = EPOLLIN | EPOLLET;

                        // Add the socket for the new worker to the list of file descriptors monitored,...
                        epoll_ctl(epollInstance, EPOLL_CTL_ADD, x, &networkEvents[y-1]);

                        trainingData.open("/tmp/trainingData.txt");
                     }
                  else if(networkEvents[y-1].data.fd == x) // Worker is waiting to receive datapoints for calibration,...
                     {
                        std::cout << "nextCharacter before call to ioctl: " << nextCharacter << std::endl;
                        ioctl(networkEvents[y-1].data.fd, FIONREAD, &w);
                        std::cout << "nextCharacter after call to ioctl: " << nextCharacter << std::endl;

                        recv(networkEvents[y-1].data.fd, &v, sizeof(v), MSG_DONTWAIT); // Retrieve and discard the 'tickle' from worker

                        if(nextCharacter != EOF)
                           {
                              trainingData >> v;

                              send(networkEvents[y-1].data.fd, &v, sizeof(v), MSG_DONTWAIT);
                              advanceToNextInputValue(trainingData, nextCharacter);
                           }
                     }

                  y--;
               }
         }

      close(epollInstance);
      return 0;
   }

Here is the client part:

#include <arpa/inet.h>

int main()
   {
      int workerSocket = socket(AF_INET, SOCK_STREAM, 0);

      // Name the socket as agreed with the server:
      sockaddr_in serverSocketAddress;
      serverSocketAddress.sin_family = AF_INET;
      serverSocketAddress.sin_port = htons(9988);
      inet_pton(AF_INET, "127.0.0.1", &(serverSocketAddress.sin_addr));

      // Connect your socket to the server's socket:
      connect(workerSocket, reinterpret_cast<sockaddr*>(&serverSocketAddress), sizeof(serverSocketAddress));

      long double z;
      while(1)
         {
            send(workerSocket, &z, sizeof(z), MSG_DONTWAIT); // Send a dummy result/tickle to server,...
            recv(workerSocket, &z, sizeof(z), MSG_WAITALL);
         }
   }

The part of the code I am having trouble with is the following (from the server):

std::cout << "nextCharacter before call to ioctl: " << nextCharacter << std::endl;
ioctl(networkEvents[y-1].data.fd, FIONREAD, &w);
std::cout << "nextCharacter after call to ioctl: " << nextCharacter << std::endl;

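Here (on my system at least), under certain circumstances the call to ioctl basically wipes out the value of 'nextCharacter', and I cannot figure out how or why!

These are the results I expect to get: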

$ ./server.exe
nextCharacter before call to ioctl: A
nextCharacter after call to ioctl: A
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl: 1
nextCharacter before call to ioctl: 9
nextCharacter after call to ioctl: 9
nextCharacter before call to ioctl: 2
nextCharacter after call to ioctl: 2
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl: 1
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl: 1
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl: 1
nextCharacter before call to ioctl: 2
nextCharacter after call to ioctl: 2
nextCharacter before call to ioctl: ÿ
nextCharacter after call to ioctl: ÿ

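(The lowercase 'y' with the diaeresis is the end-of-file character, EOF.)

These are the results I actually get (note that we end up in an infinite loop, because the stopping condition depends on the value of nextCharacter, which has been wiped out, so the loop never stops):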

$ ./server.exe
nextCharacter before call to ioctl: A
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 9
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 2
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 2
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: ÿ
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: ÿ
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: ÿ
nextCharacter after call to ioctl:
.
.
.

If I comment out either of the sleep statements in this section (in the server):

void advanceToNextInputValue(std::ifstream &trainingData, char &nextCharacter)
   {

      nextCharacter = trainingData.peek();
      while(nextCharacter != EOF && !isdigit(nextCharacter))
         {
sleep(1);
            trainingData.get();
sleep(1);
            nextCharacter = trainingData.peek();
         }
   }

then I get the results I expect.

Here is the makefile I am using:

$ cat Makefile
all: server client

server: server.cpp
        g++ server.cpp -o server.exe -ansi -fno-elide-constructors -O3 -pedantic-errors -Wall -Wextra -Winit-self -Wold-style-cast -Woverloaded-virtual -Wuninitialized -Winit-self

client: client.cpp
        g++ client.cpp -o client.exe -ansi -fno-elide-constructors -O3 -pedantic-errors -Wall -Wextra -Winit-self -Wold-style-cast -Woverloaded-virtual -Wuninitialized -Winit-self

with a trainingData.txt file that looks like this:

$ cat trainingData.txt
15616.16993666375,15616.16993666375,9.28693983312753E20,24.99528974548316,16.91935342923897,16.91935342923897,1.386594632397968E6,2.567209162871251

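So have I found a new bug, or am I just being stupid? :) Honestly, I do not see why calling ioctl with FIONREAD, which is supposed to tell me how many bytes are waiting to be read on the socket, should affect the value of the variable 'nextCharacter' in any way.

Note that this is a cut-down version of the original program that still reproduces the problem (on my system at least), so please keep in mind that some things in the code snippets above may not make sense :)

Terry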

1 Answer:

Answer 0: (score: 0)

From man ioctl_list:

  

FIONREAD int *

That is, FIONREAD expects a pointer to an int, but you are passing a pointer to a signed char.

Solution: change your:

signed char w;

to:

int w;

Otherwise you will get undefined behavior.

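The explanation for what you are seeing is probably that the compiler places the w and nextCharacter variables next to each other in memory, and the overflow of the former overwrites the value of the latter.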