Question

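I'm trying to set up a script written in Ruby that opens port 2004 on a server. When the server is called over HTTP at http://<server>:2004/, it returns HTTP headers plus a response. The response is read from a file. This works for small content, but not for something like 50 MB. Somehow it just breaks. By the way, I'm testing this script with SoapUI.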

Here is the source code; I think it is fairly self-explanatory. To make it easier to read, I marked the response part in large type.

#!/bin/ruby

require 'socket'
require 'timeout'
require 'date'

server = TCPServer.open 2004
puts "Listening on port 2004"

#file="dump.request"

loop {
    Thread.start(server.accept) do |client|
        date = Time.now.strftime("%d-%m-%Y_%H-%M-%S")
        file = "#{date}_mt_dump.txt"
        puts date
        puts "Accepting connection"
        #client = server.accept
        #resp = "OKY|So long and thanks for all the fish!|OKY"
        ticket_id = "1235"


        partial_data = ""
        i = 1024
        firstrun = "yes"
        fd = File.open(file,'w')
        puts "Attempting receive loop"

        puts "Ready to transfer contents to the client"
        f = File.open("output.txt.gz","r")
        puts "Opened file output.txt.gz; size: #{f.size}"
        resp = f.read(f.size)

        headers = ["HTTP/1.1 200 OK",
             "Content-Encoding: gzip",
             "Content-Type: text/xml;charset=UTF-8",
             "Content-Length: #{f.size}\r\n\r\n"].join("\r\n")
        client.puts headers

        #puts all_data.join()
        fd.close unless fd == nil

        puts "Start data transfer"
        client.puts resp
        client.close
        puts "Closed connection"
        puts "\n"
    end
}

Answer 1

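I see a number of problems in your code, some conceptual and some technical, but without more information about the error you are getting it may not be possible to give a proper answer.

My first thought was that the problem comes from opening the gzipped file without the binary mode flag, so the file read stops at the first EOF character and newline markers may get converted.

Some technical issues to consider:

Your loop is infinite. You should set up a signal trap so that you can exit the script (for example, by catching ^C).
Zip files are normally binary files. You should open the file in binary mode or, if you load the whole file into memory, use the IO.binread method.
You load the whole file into memory before sending it. That is great for small files, but it is not the best approach for larger ones. Loading 50 MB of RAM per client while serving 100 clients means 5 GB of RAM...

Taking the first two technical points into account, I would tweak the code slightly: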

require 'socket'

server = TCPServer.open 2004
puts "Listening on port 2004"

keep_running = true
# Trap ^C (SIGINT) so the accept loop can be interrupted.
trap('INT'){ keep_running = false ; raise ::SystemExit}

begin
    while keep_running
        Thread.start(server.accept) do |client|
            date = Time.now.strftime("%d-%m-%Y_%H-%M-%S")
            file = "#{date}_mt_dump.txt"
            puts date
            puts "Accepting connection"
            #client = server.accept
            #resp = "OKY|So long and thanks for all the fish!|OKY"
            ticket_id = "1235"


            partial_data = ""
            i = 1024
            firstrun = "yes"
            # Ruby expects the access mode first, then the binary flag: 'wb' / 'rb'.
            fd = File.open(file,'wb')
            puts "Attempting receive loop"

            puts "Ready to transfer contents to the client"
            f = File.open("output.txt.gz","rb")
            puts "Opened file output.txt.gz; size: #{f.size}"
            resp = f.read(f.size)

            headers = ["HTTP/1.1 200 OK",
                 "Content-Encoding: gzip",
                 "Content-Type: text/xml;charset=UTF-8",
                 "Content-Length: #{f.size}\r\n\r\n"].join("\r\n")
            client.write headers
            f.close

            #puts all_data.join()
            fd.close unless fd == nil

            puts "Start data transfer"
            # write, not puts: puts may append a newline that Content-Length does not count.
            client.write resp
            client.close
            puts "Closed connection"
            puts "\n"
        end
    end
rescue => e
    puts e.message
    puts e.backtrace
rescue SystemExit => e
    puts "exiting... please notice that existing threads will be brutally stopped, as we will not wait for them..."
end
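
The code above still reads the whole file into memory for every client (the third technical point). As a rough sketch, assuming the response is a plain file on disk, the body could be streamed in chunks with IO.copy_stream instead. The helper name send_file_response is made up for illustration and is not part of the original script:

```ruby
# Hypothetical helper (not from the original script): send the HTTP headers
# and then stream the file to any writable IO, e.g. an accepted TCPSocket.
def send_file_response(client, path)
  File.open(path, 'rb') do |f|
    client.write "HTTP/1.1 200 OK\r\n" \
                 "Content-Encoding: gzip\r\n" \
                 "Content-Type: text/xml;charset=UTF-8\r\n" \
                 "Content-Length: #{f.size}\r\n" \
                 "Connection: close\r\n\r\n"
    # IO.copy_stream copies from the file to the client in chunks, so the
    # file is never held in memory as one big string.
    IO.copy_stream(f, client)
  end
end
```

Inside the accept loop this would take the place of the f.read / client.puts resp pair, for example send_file_response(client, "output.txt.gz").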

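As for my more general pointers:

Your code starts a new thread for every connection. That is fine for a small number of concurrent connections, but with a large number of them your script may grind to a halt. The context switches alone (moving between threads) can create a DoS situation.

I suggest you use the Reactor pattern with a thread pool. Another option is to fork a few processes that listen on the same TCPSocket.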
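
To illustrate the thread pool idea only (this is not a full reactor), here is a minimal sketch in which a fixed number of worker threads all accept on the same listening socket, so the number of threads stays bounded no matter how many clients connect. The name start_worker_pool and the pool size are assumptions made for the example:

```ruby
require 'socket'

# Hypothetical helper: start `size` worker threads that share one listening
# socket. Each worker handles one connection at a time via the given block.
def start_worker_pool(server, size, &handler)
  Array.new(size) do
    Thread.new do
      loop do
        begin
          client = server.accept
        rescue IOError, Errno::EBADF
          break # the listening socket was closed; let this worker exit
        end
        begin
          handler.call(client)
        rescue StandardError => e
          warn "worker error: #{e.class}: #{e.message}"
        ensure
          client.close unless client.closed?
        end
      end
    end
  end
end
```

A call such as start_worker_pool(TCPServer.new(2004), 8) { |client| ... } would then cap the script at eight concurrent connections; further clients wait in the kernel's accept backlog.
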
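You are not reading any data from the socket and you are not parsing the HTTP request, which means someone could fill up the system buffer, which you never empty, by continuously sending data.

It would be better to read the data from the socket, or at least empty its buffer, and to disconnect any malformed connection.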
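
As a rough sketch of what reading the request could look like with plain sockets, the helper below reads the request line and headers up to the blank line and returns nil for anything malformed, so the caller can close the connection. The name read_request and the 100-line limit are assumptions for the example, and request bodies are ignored:

```ruby
# Hypothetical helper: read the request line and headers from an IO-like
# object. Returns [verb, path, headers], or nil if the request is malformed,
# incomplete, or has more than max_lines header lines.
def read_request(io, max_lines = 100)
  request_line = io.gets or return nil
  verb, path, version = request_line.strip.split(/\s+/, 3)
  return nil unless verb && path && version.to_s.start_with?('HTTP/')

  headers = {}
  max_lines.times do
    line = io.gets or return nil          # connection closed mid-headers
    line = line.chomp                      # chomp also strips "\r\n"
    return [verb, path, headers] if line.empty?
    name, value = line.split(':', 2)
    return nil unless value                # header line without a colon
    headers[name.strip.downcase] = value.strip
  end
  nil
end
```

In the accept loop, the response would only be sent after read_request(client) returns a non-nil result; otherwise the client is closed right away.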

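Also, most browsers aren't too happy when the response arrives before the request...

You don't rescue any exceptions and you don't print any error messages. This means your script may raise an exception that tears everything down. For example, if your "server" reaches its process's open file limit, the accept method will raise an exception that shuts down the whole script, existing connections included.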
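
One possible way to keep the script alive in that situation, sketched under the assumption that running out of file descriptors shows up as Errno::EMFILE (per-process limit) or Errno::ENFILE (system-wide limit), is to log the error, wait briefly for other connections to close, and try again. The helper name accept_with_retry and its parameters are made up for the example:

```ruby
# Hypothetical helper: accept one connection, retrying a few times when the
# process or system has run out of file descriptors.
def accept_with_retry(server, retries: 3, delay: 0.1)
  attempts = 0
  begin
    server.accept
  rescue Errno::EMFILE, Errno::ENFILE => e
    attempts += 1
    raise if attempts > retries            # give up and let the caller decide
    warn "accept failed (#{e.class}), retry #{attempts}/#{retries}"
    sleep delay                            # give existing connections time to close
    retry
  end
end
```

The accept loop would then call accept_with_retry(server) instead of server.accept directly, while still wrapping the per-connection work in its own begin/rescue.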

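I'm not sure why you aren't using one of the many HTTP servers available for Ruby, whether the built-in WEBrick (not for production) or one of the native Ruby gems from the community.

Here is a short example using Iodine, which has an easy-to-use HTTP server written in Ruby (no need to compile anything):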

require 'iodine/http'

# cache the file, since it's the only response ever sent
file_data = IO.binread "output.txt.gz"

Iodine.on_http do |request, response|
    begin
        # set any headers
        response['content-type'] = 'text/xml;charset=UTF-8'
        response['content-encoding'] = 'gzip'
        response << file_data
        true
    rescue => e
        Iodine.error e
        false
    end
end

#if in irb:
exit

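Or, if you insist on writing your own HTTP server, you can at least use one of the available IO reactors, such as Iodine (which I wrote), to help you handle the thread pool and IO management. You could also use EventMachine, but I don't like it as much; then again, I'm biased, since I wrote the Iodine library: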

require 'iodine'
require 'stringio'

class MiniServer < Iodine::Protocol

    # cache the file, since it's the only data sent,
    # and make it available to all the connections.
    def self.data
        @data ||= IO.binread 'output.txt.gz'
    end

    # The on_open callback is called when a connection is established.
    # We'll use it to initialize the HTTP request's headers Hash.
    def on_open
     @headers = {}
    end

    # the on_message callback is called when data is sent from the client to the socket.
    def on_message input
        input = StringIO.new input
        l = nil
        headers = @headers # easy access
        # loop the lines and parse the HTTP request.
        while (l = input.gets)
            unless l.match /^[\r]?\n/
                if l.include? ':'
                    l = l.strip.downcase.split(':', 2)
                    headers[l[0]] = l[1]
                else
                    headers[:method], headers[:query], headers[:version] = l.strip.split(/[\s]+/, 3)
                    headers[:request_start] = Time.now
                end
                next
            end
            # keep the connection alive if the HTTP version is 1.1 or if the connection is requested to be kept alive
            keep_alive = (headers['connection'].to_s.match(/keep/i) || headers[:version].to_s.match(/1\.1/)) && true
            # refuse any file uploads or forms. make sure the request is a GET request
            return close if headers['content-length'] || headers['content-type'] || headers[:method].to_s.match(/get/i).nil?
            # all is well, send the file.
            write ["HTTP/1.1 200 OK",
                    "Connection: #{keep_alive ? 'keep-alive' : 'close'}",
                     "Content-Encoding: gzip",
                     "Content-Type: text/xml;charset=UTF-8",
                     "Content-Length: #{self.class.data.bytesize}\r\n\r\n"].join("\r\n")
            write self.class.data
            return close unless keep_alive

            # reset the headers, in case another request comes in
            headers.clear
        end
    end

end

Iodine.protocol = MiniServer
# # if running within a larger application, consider:
# Iodine.force_start!
# # Server starts automatically when the script ends.
# # on irb, use `exit`:
exit

Good luck!

Ruby HTTP response script that sends large gzipped content
