Ruby GEDCOM parser EOF exception

Asked: 2014-05-04 09:51:12

Tags: ruby parsing gedcom

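I need to parse GEDCOM 5.5 files for an analysis project. The first Ruby parser I found fails with a "stack level too deep" error, so I looked for an alternative. I came across this project: https://github.com/jslade/gedcom-ruby

It includes some samples, but I could not get those to work either.

Here is the parser itself: https://github.com/jslade/gedcom-ruby/blob/master/lib/gedcom.rb

If I run one of the samples like this: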

ruby ./samples/count.rb ./samples/royal.ged

I get the following error:

D:/rails_projects/gedom_test/lib/gedcom.rb:185:in `readchar': end of file reached (EOFError)

I added a puts to every method to trace the execution. This is the output up to the point where the exception is raised:

Parsing './samples/royal.ged'...
INIT
BEFORE
CHECK_PROC_OR_BLOCK
BEFORE
CHECK_PROC_OR_BLOCK
PARSE
PARSE_FILE
PARSE_IO
DETECT_RS

The exact line causing the problem is

while ch = io.readchar
in the detect_rs method:

# valid gedcom may use either of \r or \r\n as the record separator.
# just in case, also detects simple \n as the separator as well
# detects the rs for this string by scanning ahead to the first occurence
# of either \r or \n, and checking the character after it
def detect_rs io
puts "DETECT_RS"
  rs = "\x0d"
  mark = io.pos
  begin
    while ch = io.readchar
      case ch
      when 0x0d
        ch2 = io.readchar
        if ch2 == 0x0a
          rs = "\x0d\x0a"
        end
        break
      when 0x0a
        rs = "\x0a"
        break
      end
    end
  ensure
    io.pos = mark
  end
  rs
end

I hope someone can help me.

1 Answer:

Answer 0: (score: 1)

Ruby's IO#readchar method raises an EOFError when it reaches the end of the file: http://www.ruby-doc.org/core-2.1.1/IO.html#method-i-readchar
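To illustrate the behavior, here is a minimal sketch that uses StringIO in place of a real file (an assumption made only to keep the example self-contained). Because readchar never returns nil, a `while ch = io.readchar` loop can only end through a break or through the exception:

```ruby
require 'stringio'

# StringIO stands in for the GEDCOM file here; it follows the same
# readchar contract as IO.
io = StringIO.new("ab")

chars = []
reached_eof = false

begin
  # readchar raises EOFError at end of input instead of returning nil,
  # so this condition never becomes false on its own.
  while ch = io.readchar
    chars << ch
  end
rescue EOFError
  reached_eof = true
end

puts chars.inspect
puts reached_eof
```

On Ruby 1.9 and later the collected characters are one-character Strings, not Integers, which is relevant to the `when 0x0d` and `when 0x0a` comparisons in detect_rs.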

The gedcom-ruby gem has not been touched in years, but there are a few forks that fix this problem.

Basically, the fix changes:

while ch = io.readchar

to:

while !io.eof && ch = io.readchar
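The eof guard stops the exception, but on Ruby 1.9 and later readchar returns a one-character String, so comparisons against the Integers 0x0d and 0x0a never match and the loop would likely fall through to the default separator. The following is only a sketch of how detect_rs could be written for a modern Ruby. It is a hypothetical rewrite, not code taken from either repository, and it again uses StringIO in place of a file:

```ruby
require 'stringio'

# Hypothetical rewrite of detect_rs; not the code from gedcom-ruby or its forks.
# Scans ahead to the first \r or \n and returns the record separator in use.
def detect_rs(io)
  rs = "\r"          # same default as the original method
  mark = io.pos      # remember where we started
  begin
    until io.eof?
      case io.readchar
      when "\r"
        # Look at the next character, if there is one, to tell \r from \r\n.
        rs = "\r\n" if !io.eof? && io.readchar == "\n"
        break
      when "\n"
        rs = "\n"
        break
      end
    end
  ensure
    io.pos = mark    # always rewind so parsing starts from the same place
  end
  rs
end

puts detect_rs(StringIO.new("0 HEAD\r\n1 SOUR X\r\n")).inspect
puts detect_rs(StringIO.new("0 HEAD\n1 SOUR X\n")).inspect
puts detect_rs(StringIO.new("0 HEAD")).inspect
```

Comparing against string literals instead of integer codes is the design choice here. Input with no line ending at all falls back to the default separator instead of raising.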

You can get a fork of the gem here: https://github.com/trentlarson/gedcom-ruby