Question

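I am trying to understand and recreate a simple preforking server along the lines of unicorn, where the server forks 4 processes on start, all of which wait (to accept) on the controlling socket.

The controlling socket @control_socket binds to port 9799, and the server spawns 4 workers which wait to accept a connection. The work done by each worker is as follows: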


        def spawn_child
            fork do
                $STDOUT.puts "Forking child #{Process.pid}"
                loop do 
                    @client = @control_socket.accept                                        
                    loop do                     
                        request = gets              

                        if request                          
                            respond(@inner_app.call(request))                           
                        else
                            $STDOUT.puts("No Request")
                            @client.close                           
                        end
                    end
                end
            end
        end

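I have used a very simple rack app which returns a string with status code 200 and a Content-Type of text/html.

The problem I'm facing is that my server works as it should when I read the incoming request (by hitting the URL "http://localhost:9799") using gets, but not when I use something like read, readpartial or read_nonblock. When I use non-blocking reads it never seems to raise EOFError, which according to my understanding means it never receives the EOF state.

This causes the read loop to never complete. Here is the code snippet which does this bit of work.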


        # Reads a file using IO.read_nonblock
        # Returns end of file when using get but doesn't seem to return 
        # while using read_nonblock or readpartial
        # The fact that the method is named gets is just bad naming, please ignore
        def gets
            buffer = ""         
            i =0
            loop do
                puts "loop #{i}"
                i += 1
                begin
                    buffer << @client.read_nonblock(READ_CHUNK)
                    puts "buffer is #{buffer}"
                rescue  Errno::EAGAIN => e
                    puts "#{e.message}"
                    puts "#{e.backtrace}"
                    IO.select([@client])
                    retry
                rescue EOFError
                    $STDOUT.puts "-" * 50
                    puts "request data is #{buffer}"    
                    $STDOUT.puts "-" * 50
                    break           
                end
            end
            puts "returning buffer"
            buffer
        end

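However, the code works perfectly fine if I use a simple gets instead of read or read_nonblock, or if I replace the IO.select([@client]) with a break.

This is the version where the code works and returns the response. The reason I intend to use read_nonblock is that unicorn uses an equivalent from the kgio library, which implements non-blocking reads.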


def gets
  @client.gets
end

The entire code is pasted next.


require 'socket'
require 'builder'
require 'rack'
require 'pry'

module Server   
    class Prefork
        # line break 
        CRLF  = "\r\n"
        # number of workers process to fork
        CONCURRENCY = 4
        # size of each non_blocking read
        READ_CHUNK = 1024

        $STDOUT = STDOUT
        $STDOUT.sync

        # creates a control socket which listens to port 9799
        def initialize(port = 21)
            @control_socket = TCPServer.new(9799)
            puts "Starting server..."
            trap(:INT) {
                exit
            }
        end

        # Reads a file using IO.read_nonblock
        # Returns end of file when using get but doesn't seem to return 
        # while using read_nonblock or readpartial
        def gets
            buffer = ""         
            i =0
            loop do
                puts "loop #{i}"
                i += 1
                begin
                    buffer << @client.read_nonblock(READ_CHUNK)
                    puts "buffer is #{buffer}"
                rescue  Errno::EAGAIN => e
                    puts "#{e.message}"
                    puts "#{e.backtrace}"
                    IO.select([@client])
                    retry
                rescue EOFError
                    $STDOUT.puts "-" * 50
                    puts "request data is #{buffer}"    
                    $STDOUT.puts "-" * 50
                    break           
                end
            end
            puts "returning buffer"
            buffer
        end

        # responds with the data and closes the connection
        def respond(data)
            puts "request 2 Data is #{data.inspect}"
            status, headers, body = data
            puts "message is #{body}"
            buffer = "HTTP/1.1 #{status}\r\n" \
                     "Date: #{Time.now.utc}\r\n" \
                     "Status: #{status}\r\n" \
                     "Connection: close\r\n"            
            headers.each {|key, value| buffer << "#{key}: #{value}\r\n"}          
            @client.write(buffer << CRLF)
            body.each {|chunk| @client.write(chunk)}            
        ensure 
            $STDOUT.puts "*" * 50
            $STDOUT.puts "Closing..."
            @client.respond_to?(:close) and @client.close
        end

        # The main method which triggers the creation of workers processes
        # The workers processes all wait to accept the socket on the same
        # control socket allowing the kernel to do the load balancing.
        # 
        # Working with a dummy rack app which returns a simple text message
        # hence the config.ru file read.
        def run         
            # copied from unicorn-4.2.1
            # refer unicorn.rb and lib/unicorn/http_server.rb           
            raw_data = File.read("config.ru")           
            app = "::Rack::Builder.new {\n#{raw_data}\n}.to_app"
            @inner_app = eval(app, TOPLEVEL_BINDING)
            child_pids = []
            CONCURRENCY.times do
                child_pids << spawn_child
            end

            trap(:INT) {
                child_pids.each do |cpid|
                    begin 
                        Process.kill(:INT, cpid)
                    rescue Errno::ESRCH
                    end
                end

                exit
            }

            loop do
                pid = Process.wait
                puts "Process quit unexpectedly #{pid}"
                child_pids.delete(pid)
                child_pids << spawn_child
            end
        end

        # This is where the real work is done.
        def spawn_child
            fork do
                $STDOUT.puts "Forking child #{Process.pid}"
                loop do 
                    @client = @control_socket.accept                                        
                    loop do                     
                        request = gets              

                        if request                          
                            respond(@inner_app.call(request))                           
                        else
                            $STDOUT.puts("No Request")
                            @client.close                           
                        end
                    end
                end
            end
        end
    end
end

p = Server::Prefork.new(9799)
p.run

Can someone explain to me why the read never finishes with readpartial, read_nonblock or read? I would really appreciate some help.

Thanks.

Answer 1

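First I want to go over some basics. EOF means end of file. It acts like a signal delivered to the caller when no more data can be read from the data source: for example, you get EOF after reading an entire file, or after the other side simply closes the IO stream.

There are several differences between these 4 methods: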

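gets reads one line from the stream. In Ruby it uses $/ as the default line separator, but you can pass a separator as an argument, since the separator may differ if the client and server run on different operating systems. It is a blocking method: it blocks until it meets a line separator or EOF. It returns nil when it receives EOF, so gets will never raise an EOFError.
read(length) reads length bytes from the stream. It is a blocking method. If length is omitted, it blocks until it reads EOF; if length is given, it returns only once it has read that many bytes or met EOF. At EOF it returns nil when a positive length was given and an empty string when length was omitted, so read will never raise an EOFError.
readpartial(maxlen) reads at most maxlen bytes from the stream. It reads whatever data is available and returns immediately, a bit like an eager version of read, so if the data is large you can use readpartial instead of read to avoid blocking. It is still a blocking method, though: it blocks if no data is immediately available. readpartial raises an EOFError when it receives EOF.
read_nonblock(maxlen) is similar to readpartial, but as the name says it is a non-blocking method. If no data is available it immediately raises Errno::EAGAIN, which means "no data right now", and you have to handle that error. Normally the Errno::EAGAIN rescue clause should first call IO.select([conn]) to avoid spinning needlessly (it blocks until conn becomes readable) and then retry. read_nonblock raises an EOFError when it receives EOF.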
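The EOF behaviour described above can be checked with a small experiment. This is only a sketch: it uses IO.pipe instead of a TCP socket, and the helper names closed_stream and eof_behaviour are made up for illustration. Closing the write end of the pipe plays the role of the client closing its connection.

```ruby
# Hypothetical helper: returns the read end of a pipe whose writer wrote
# `data` and then closed, so the reader sees the data followed by EOF.
def closed_stream(data)
  reader, writer = IO.pipe
  writer.write(data)
  writer.close
  reader
end

# Hypothetical helper: records what each read method does at EOF.
def eof_behaviour
  results = {}

  io = closed_stream("hello\n")
  io.gets                                   # consumes "hello\n"
  results[:gets] = io.gets                  # nil at EOF, no exception
  io.close

  io = closed_stream("hello\n")
  io.read(6)                                # consumes all 6 bytes
  results[:read_with_length] = io.read(6)   # nil at EOF
  results[:read_without_length] = io.read   # "" at EOF
  io.close

  io = closed_stream("hello\n")
  io.readpartial(1024)                      # returns the available data
  results[:readpartial] = begin
    io.readpartial(1024)
  rescue EOFError => e
    e.class
  end
  io.close

  io = closed_stream("hello\n")
  io.read_nonblock(1024)                    # data is already buffered
  results[:read_nonblock] = begin
    io.read_nonblock(1024)
  rescue EOFError => e
    e.class
  end
  io.close

  # Writer still open and nothing written: no EOF, just "try again later".
  reader, writer = IO.pipe
  results[:read_nonblock_no_data] = begin
    reader.read_nonblock(1024)
  rescue IO::WaitReadable                   # Errno::EAGAIN includes this module
    :would_block
  end
  writer.close
  reader.close

  results
end
```

The last case is the one that matters for the question: as long as the peer keeps the connection open, read_nonblock reports "would block" rather than EOF, no matter how long you wait.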

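Now let's look at your example. As I see it, you first try to read data by "hitting the URL", which is just an HTTP GET request, i.e. some text like "GET / HTTP/1.1\r\n". Connections are keep-alive by default in HTTP/1.1, so the browser does not close the socket after sending the request, and readpartial or read_nonblock will never receive EOF. To get around this, either add a Connection: close header to your request, or change your gets method so that it does not wait for EOF at all, for example by reading line by line and stopping at the blank line that ends the request headers.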
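One possible shape for such a line-based reader is sketched below. It is not the code from the original answer; the method name read_request_head is hypothetical, and it assumes the request has no body (as with a plain browser GET).

```ruby
# Hypothetical replacement for the question's `gets` method.
# Reads the request line and headers one line at a time and stops at the
# blank line that terminates the headers, so it never has to wait for EOF.
def read_request_head(client)
  buffer = String.new
  while (line = client.gets)      # nil means the peer closed the connection
    buffer << line
    break if line.strip.empty?    # "\r\n" on its own marks the end of the headers
  end
  buffer
end
```

With a keep-alive client this returns as soon as the empty line arrives, even though the socket is still open. A request that carries a body (for example a POST) would additionally need the Content-Length header to know how many more bytes to read.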

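You can't use read here, because you don't know the exact length of the request. Passing a large length, or simply omitting it, makes the call block until that many bytes arrive or the client closes the connection.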
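If the goal is to keep read_nonblock, as in the question, the same idea applies: stop on the header terminator instead of waiting for EOFError. The following is a sketch under assumptions, not unicorn's or kgio's actual implementation; read_head_nonblock and its keyword arguments are made-up names, and it again assumes a request without a body.

```ruby
# Hypothetical non-blocking reader: accumulates chunks until the buffer
# contains the blank line that ends the HTTP headers, or the peer closes.
def read_head_nonblock(client, chunk_size: 1024, terminator: "\r\n\r\n")
  buffer = String.new
  until buffer.include?(terminator)
    begin
      buffer << client.read_nonblock(chunk_size)
    rescue IO::WaitReadable       # raised as Errno::EAGAIN when no data is ready
      IO.select([client])         # sleep until the socket is readable again
      retry
    rescue EOFError               # peer closed before sending a full header block
      break
    end
  end
  buffer
end
```

The difference from the loop in the question is the exit condition: the loop ends when the terminator shows up in the buffer, so a keep-alive connection that never sends EOF no longer leaves the worker parked in IO.select.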

Ruby readpartial and read_nonblock not throwing an EOFError
