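I need to parse GEDCOM 5.5 files for an analysis project. The first Ruby parser I found failed with a "stack level too deep" error, so I looked for alternatives and found this project: https://github.com/jslade/gedcom-ruby
It includes some samples, but I could not get them to work either.
Here is the parser itself: https://github.com/jslade/gedcom-ruby/blob/master/lib/gedcom.rb
If I run one of the samples like this: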
ruby ./samples/count.rb ./samples/royal.ged
I get the following error:
D:/rails_projects/gedom_test/lib/gedcom.rb:185:in `readchar': end of file reached (EOFError)
I added a puts call to every method to trace execution. This is the output up to the point where the exception is raised:
Parsing './samples/royal.ged'...
INIT
BEFORE
CHECK_PROC_OR_BLOCK
BEFORE
CHECK_PROC_OR_BLOCK
PARSE
PARSE_FILE
PARSE_IO
DETECT_RS
The exact line that causes the problem is
while ch = io.readchar
in the detect_rs method:
# valid gedcom may use either of \r or \r\n as the record separator.
# just in case, also detects simple \n as the separator as well
# detects the rs for this string by scanning ahead to the first occurence
# of either \r or \n, and checking the character after it
def detect_rs io
  puts "DETECT_RS"
  rs = "\x0d"
  mark = io.pos
  begin
    while ch = io.readchar
      case ch
      when 0x0d
        ch2 = io.readchar
        if ch2 == 0x0a
          rs = "\x0d\x0a"
        end
        break
      when 0x0a
        rs = "\x0a"
        break
      end
    end
  ensure
    io.pos = mark
  end
  rs
end
I hope someone can help me.
Answer 0 (score: 1)
Ruby's IO#readchar method raises EOFError when it reaches the end of the file: http://www.ruby-doc.org/core-2.1.1/IO.html#method-i-readchar
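To make the behavior concrete, here is a minimal sketch that contrasts readchar with getc at end of input. It uses StringIO as a stand-in for an open file (an assumption made only to keep the example self-contained). StringIO follows the same end-of-file rules as IO for these two methods.

```ruby
require "stringio"

# StringIO stands in for an open file here (illustrative assumption).
io = StringIO.new("ab")

chars = []
begin
  # readchar raises EOFError once the input is exhausted, so this loop
  # never ends by its condition becoming false.
  while (ch = io.readchar)
    chars << ch
  end
rescue EOFError
  chars << :eof_error
end

# getc returns nil at end of file instead, which ends a while loop cleanly.
io.rewind
via_getc = []
while (ch = io.getc)
  via_getc << ch
end
```

After this runs, chars ends with the :eof_error marker, while via_getc simply holds the two characters.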
The gedcom-ruby gem has not been touched in years, but there are some forks that fix this problem.
Basically, the fix changes:
while ch = io.readchar
to
while !io.eof && ch = io.readchar
You can get a fork of the gem here: https://github.com/trentlarson/gedcom-ruby
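If you would rather patch the method yourself, the following is a rough sketch of detect_rs, not the code from either repository. It assumes Ruby 1.9 or later, where readchar and getc return a one-character String rather than an Integer. Under that assumption the original `when 0x0d` and `when 0x0a` branches can never match, which would explain why the loop runs all the way to the end of the file. The sketch compares against strings and uses getc, which returns nil at end of file.

```ruby
require "stringio"

# Hypothetical rewrite for Ruby 1.9+; not taken from the gem or its forks.
def detect_rs(io)
  rs = "\x0d"             # default to a bare carriage return, as in the original
  mark = io.pos
  begin
    # getc returns nil at end of file, so the loop ends without an exception.
    while (ch = io.getc)
      case ch
      when "\x0d"
        rs = "\x0d\x0a" if io.getc == "\x0a"
        break
      when "\x0a"
        rs = "\x0a"
        break
      end
    end
  ensure
    io.pos = mark          # restore the read position for the actual parse
  end
  rs
end

# Usage with in-memory data standing in for a GEDCOM file (illustrative only).
sample = StringIO.new("0 HEAD\r\n1 SOUR TEST\r\n")
separator = detect_rs(sample)
```

Because the position is restored in the ensure block, the caller can start parsing from the same offset afterwards.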