How to iterate over two arrays independently

Time: 2015-06-29 22:44:01

Tags: arrays ruby iterator zip enumerator

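My small personal project is to merge two logs by timestamp. Both logs use the same timestamp format. Some lines have no timestamp and should be printed together with the timestamped line they follow.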

So, if I have a log like this:

loop do
  code
end

The other file looks the same, but has different timestamps. As I am new to Ruby, I thought this project would be a good way to get started.

So far I have found enough help to know that I should use enumerators, I guess something like

eq

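But how do I decide when to advance through file1 without also advancing through file2? And how do I find out when an iterator has reached the end of its file, so that I can print the rest of the other file?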
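One way to advance the two sources independently is Ruby's external enumerators: Enumerator#peek looks at the next element without consuming it, Enumerator#next consumes it, and both raise StopIteration once the end is reached. The following is a minimal sketch under those assumptions; the helper names peek_or_nil and drain and the sample arrays are illustrative, not part of the question.

```ruby
# Return the next element without consuming it, or nil once the
# enumerator is exhausted (peek raises StopIteration at the end).
def peek_or_nil(enum)
  enum.peek
rescue StopIteration
  nil
end

# Consume and return everything that is left in the enumerator.
def drain(enum)
  rest = []
  rest << enum.next until peek_or_nil(enum).nil?
  rest
end

# Hypothetical sample data standing in for the lines of two files.
file1 = %w[a1 a2 a3].each
file2 = %w[b1].each

file1.next                      # advances only file1
file2.next                      # advances only file2, which is now exhausted
puts peek_or_nil(file2).inspect # nil, so file2 is at its end
puts drain(file1).inspect       # the remaining elements of file1
```

Because each enumerator keeps its own position, calling next on one never moves the other. Note that this sketch treats a nil element as the end marker, which is fine for lines of text but would not suit arrays that contain nil.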

Should I first read each file into an array, or just work with two streams, one per file, and stream the result to the output file?
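To illustrate the difference between the two options: File.foreach called without a block returns an Enumerator that reads lines on demand, whereas File.readlines loads the whole file into an array. The sketch below uses a temporary file so that it is self-contained; the helper name line_stream is hypothetical.

```ruby
require 'tempfile'

# Hypothetical helper: an external enumerator over a file's lines.
# File.foreach without a block returns an Enumerator, so lines are
# read on demand instead of being loaded into an array up front.
def line_stream(path)
  File.foreach(path)
end

Tempfile.create('log') do |f|
  f.write("first\nsecond\n")
  f.flush

  stream = line_stream(f.path)
  puts stream.next              # reads the first line only
  puts stream.next              # reads the second line

  all = File.readlines(f.path)  # by contrast, loads every line at once
  puts all.length
end
```

For large logs the streaming form keeps memory use roughly constant, while the array form is simpler if the files are small.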

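Summary: I want to iterate over two files until one of them reaches its end, then print the remaining lines of the other file, while controlling when each of the two files advances.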

Thank you for your time and input!

**Edit:**

But I want to merge them by timestamp. For example:
2015-06-25 09:20:24,123 file1 text1

2015-06-25 09:20:23,123 file2 text1
2015-06-25 09:20:26,123 file2 text2

Output:
2015-06-25 09:20:23,123 file2 text1
2015-06-25 09:20:24,123 file1 text1
2015-06-25 09:20:26,123 file2 text2

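Basically, if I have two arrays that I iterate over with iterators x and y: if x > y, put y in the output file and advance it (like y++), then keep comparing until the end of a file is reached. If x is at EOF, just append the rest of y to the output file.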
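The x and y idea described above can be sketched with two array indices. This is only an illustration: the method name merge_sorted is made up, each input is assumed to be sorted already, and lines are compared as plain strings, which works here only because the timestamps in the example are fixed-width with the most significant field first.

```ruby
# Merge two arrays that are each already sorted, using one index per array.
# Only the index of the array that supplied the element is advanced.
def merge_sorted(xs, ys)
  out = []
  x = 0
  y = 0
  while x < xs.length && y < ys.length
    if xs[x] > ys[y]
      out << ys[y]
      y += 1
    else
      out << xs[x]
      x += 1
    end
  end
  # One side is exhausted; append whatever remains of the other.
  out.concat(xs[x..-1]).concat(ys[y..-1])
end

file1 = ['2015-06-25 09:20:24,123 file1 text1']
file2 = ['2015-06-25 09:20:23,123 file2 text1',
         '2015-06-25 09:20:26,123 file2 text2']
puts merge_sorted(file1, file2)
```

This version does not handle lines without a timestamp; it only shows how the two positions move independently.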

1 Answer:

Answer 0: (score: 0)

OK, this is what I eventually came up with. I believe it should work for you. It behaves as follows:

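  • If the current line of each file starts with a parseable date, the earliest one wins (change the < in the left_date < right_date comparison to > to reverse this).
  • When a line is written to the output, the file that line came from is advanced and its next line is fetched. The line from the other file stays where it is until its turn comes.
  • If a line does not start with a parseable date, that line wins and is written. As above, its file is advanced and the next line is pulled in, repeating until a parseable date appears again or the end of the file is reached.
  • If one of the files reaches its end, the other file is streamed to the output until it reaches its end as well.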
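The rules above can be condensed into a small in-memory sketch that works on arrays of lines rather than files. It is not the answer's script: the names TIMESTAMP, stamp, and merge_lines are hypothetical, the timestamp pattern is an assumption based on the sample logs, and timestamps are compared as strings instead of parsed dates.

```ruby
# Assumed timestamp shape, taken from the sample logs: "YYYY-MM-DD HH:MM:SS,mmm".
TIMESTAMP = /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}/

# Hypothetical helper: the leading timestamp of a line, or nil if it has none.
def stamp(line)
  line && line[TIMESTAMP]
end

# Apply the rules to two arrays of lines and return the merged lines.
def merge_lines(left, right)
  out = []
  l = 0
  r = 0
  while l < left.length || r < right.length
    ls = stamp(left[l])
    rs = stamp(right[r])
    take_left =
      if l >= left.length
        false    # left is exhausted: stream the rest of right
      elsif r >= right.length
        true     # right is exhausted: stream the rest of left
      elsif ls.nil?
        true     # an untimestamped left line wins immediately
      elsif rs.nil?
        false    # an untimestamped right line wins immediately
      else
        ls < rs  # otherwise the earliest timestamp wins; ties go to right
      end
    if take_left
      out << left[l]
      l += 1
    else
      out << right[r]
      r += 1
    end
  end
  out
end

puts merge_lines(
  ['2015-01-13 14:24:23,654 a', 'continuation of a'],
  ['2013-01-05 19:27:25,654 b']
)
```

Keeping the decision in a single take_left expression makes it easy to see that exactly one side advances per iteration.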

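Note that you need to change 'file1', 'file2', and 'output' to the actual file paths, relative or absolute. You could pass these into the program as command-line arguments using ARGV or OptionParser.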
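As a rough sketch of the OptionParser route, the paths could be collected as shown below. The option names --left, --right, and --output and the method name parse_paths are invented for illustration; the defaults simply mirror the names used in the script.

```ruby
require 'optparse'

# Hypothetical option names; returns a hash with the three paths,
# falling back to the defaults for any option that is not given.
def parse_paths(argv)
  paths = { left: 'file1', right: 'file2', output: 'output' }
  OptionParser.new do |opts|
    opts.banner = 'Usage: compile_files.rb [options]'
    opts.on('--left PATH', 'First input log') { |v| paths[:left] = v }
    opts.on('--right PATH', 'Second input log') { |v| paths[:right] = v }
    opts.on('--output PATH', 'Merged output file') { |v| paths[:output] = v }
  end.parse(argv)
  paths
end

# In a real script this would be parse_paths(ARGV).
puts parse_paths(%w[--left a.log --output merged.log]).inspect
```

The returned values could then replace the hard-coded strings passed to File.open.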

Input files:

# file1

2014-06-21 07:20:25,654 file1 text2
2015-01-13 14:24:23,654 file1 text1
test text1 belongs to the row above
2015-06-21 08:57:27,654 file1 text3

# file2

2013-01-05 19:27:25,654 file1 text2
2015-04-01 10:13:23,654 file1 text1
test text5 belongs to the row above
2015-06-23 09:49:27,654 file1 text3

# output

2013-01-05 19:27:25,654 file1 text2
2014-06-21 07:20:25,654 file1 text2
2015-01-13 14:24:23,654 file1 text1
test text1 belongs to the row above
2015-04-01 10:13:23,654 file1 text1
test text5 belongs to the row above
2015-06-21 08:57:27,654 file1 text3
2015-06-23 09:49:27,654 file1 text3

# compile_files.rb

require 'date'

# Attempt to read a line from the supplied file.
# If this fails, we are at the end of the file and return nil.
def read_line_from_file(file)
  file.readline
rescue EOFError
  nil
end

# Parse the date which is at the beginning of the supplied text.
# If this fails, it doesn't start with a date so we return nil.
def parse_date(text)
  DateTime.parse(text)
rescue ArgumentError
  nil
end

begin
  # Open the files to sort
  input_file_1 = File.open('file1', 'r')
  input_file_2 = File.open('file2', 'r')

  # Open the file that will be written. Here it is named "output"
  File.open('output', 'w+') do |of|
    # Read the first line from each file
    left = read_line_from_file(input_file_1)
    right = read_line_from_file(input_file_2)

    # Loop until BOTH files have reached their end
    until left.nil? && right.nil?
      # If the first file was successfully read, 
      # attempt to parse the date at the beginning of the line
      left_date = parse_date(left) if left
      # If the second file was successfully read, 
      # attempt to parse the date at the beginning of the line
      right_date = parse_date(right) if right

      # If the first file was successfully read, 
      # but the date was not successfully parsed,
      # the line is a stack trace and needs to be printed
      # because it will be following the related
      # timestamped line.
      if left && left_date.nil?
        of << left

        # Now that we have printed that line, 
        # grab the next one from the same file.
        left = read_line_from_file(input_file_1)
        next

      # If the second file was successfully read,
      # but the date was not successfully parsed,
      # the line is a stack trace and needs to be printed
      # because it will be following the related
      # timestamped line.
      elsif right && right_date.nil?
        of << right

        # Now that we have printed that line, 
        # grab the next one from the same file.
        right = read_line_from_file(input_file_2)

        # Skip straight to the next iteration of the `until` loop.
        next
      end

      # If we got this far, neither of the lines were stack trace
      # lines. If the second file has reached its end, we need
      # to print the line we grabbed from the first file.
      if right.nil?
        of << left

        # Now that we have printed that line, 
        # grab the next one from the same file.
        left = read_line_from_file(input_file_1)

        # Skip straight to the next iteration of the `until` loop.
        next
      end

      # ADDED THIS SECTION
      # If we got this far, the second file has not
      # reached its end. If the first file has reached
      # its end, we need to print the line we grabbed 
      # from the second file.
      if left.nil?
        of << right

        # Now that we have printed that line, 
        # grab the next one from the same file.
        right = read_line_from_file(input_file_2)

        # Skip straight to the next iteration of the `until` loop.
        next
      end

      # If we got this far, neither file has reached its 
      # end and both start with timestamps. If the first file's
      # timestamp is less, it is older.
      if left_date < right_date
        of << left

        # Now that we have printed that line, 
        # grab the next one from the same file.
        left = read_line_from_file(input_file_1)

        # Skip straight to the next iteration of the `until` loop.
        next
      # Either the timestamps were the same or the second one is
      # older.
      else
        of << right

        # Now that we have printed that line, 
        # grab the next one from the same file.
        right = read_line_from_file(input_file_2)

        # Skip straight to the next iteration of the `until` loop.
        next
      end
    end
  end
ensure
  # Make sure that the file descriptors are closed, even if
  # one of the files could not be opened.
  input_file_1.close if input_file_1
  input_file_2.close if input_file_2
end
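
One caveat: DateTime.parse is heuristic and can accept text that merely contains date-like fragments, so a continuation line might occasionally be mistaken for a timestamped one. A stricter variant is sketched below. It assumes every timestamp has the exact layout shown in the sample logs; the names TIMESTAMP_PREFIX and parse_leading_timestamp are hypothetical and not part of the script above.

```ruby
require 'date'

# Assumed layout, based on the sample logs: "YYYY-MM-DD HH:MM:SS,mmm".
TIMESTAMP_PREFIX = /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}/

# Stricter stand-in for parse_date: only a line that starts with the
# assumed layout yields a DateTime; anything else yields nil.
def parse_leading_timestamp(text)
  prefix = text[TIMESTAMP_PREFIX]
  return nil unless prefix

  DateTime.strptime(prefix, '%Y-%m-%d %H:%M:%S,%L')
rescue ArgumentError
  # The digits matched the pattern but do not form a valid date.
  nil
end

puts parse_leading_timestamp('2015-06-25 09:20:24,123 file1 text1').inspect
puts parse_leading_timestamp('test text1 belongs to the row above').inspect
```

If the logs use a different layout, both the pattern and the strptime format string would need to change together.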