How to validate big XML

Time: 2015-09-14 15:51:47

Tags: ruby-on-rails ruby xml validation nokogiri

I am trying to validate an XML file against an XSD using Nokogiri. When the file is small, I parse it into a document and validate that document:

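# Load the schema, then parse the whole XML file into an in-memory document
xsd = Nokogiri::XML::Schema(File.read(Rails.root.join('files/xsd', self::XSD)))
xml = Nokogiri::XML(File.read(Rails.root.join('public/uploads', file_path)))
xsd.validate(xml).each do |error|
  # handle each validation error here, e.g. log error.message
end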

When the file is big, the previous approach is impractical because it loads the whole document into memory, so I pass the file path to validate instead:

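# Load the schema, then pass the file path (a String) so the XML is validated from disk
xsd = Nokogiri::XML::Schema(File.read(Rails.root.join('files/xsd', self::XSD)))
xml = Rails.root.join('public/uploads', file_path).to_s
xsd.validate(xml).each do |error|
  # handle each validation error here, e.g. log error.message
end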

But the second approach does not report simple well-formedness errors, such as an unclosed double quote in an attribute:

<?xml version="1.0"?>
<catalog version="123 xmlns="http://google.com">
   <book id="bk101">

and the first does.

1 answer:

Answer 0 (score: 1)

Nokogiri is a great tool for small to medium-sized XML, but for large to huge files you need to switch to other tools, such as SAX parsing or, for validation, something like xmllint.

The xmllint program parses one or more XML files, specified on the command line as xmlfile. It prints various types of output, depending upon the options selected. It is useful for detecting errors both in XML code and in the XML parser itself.

It is included in libxml2.
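
As a rough sketch of how xmllint could be driven from Ruby, the snippet below shells out with Open3 and collects whatever xmllint writes to stderr. The helper names xmllint_command and validate_with_xmllint are hypothetical and not part of any library. It assumes xmllint is on the PATH, and it assumes that --stream (which the man page describes for validating files too large to hold in memory) works together with --schema in your libxml2 build; pass stream: false if it does not.

```ruby
require 'open3'

# Hypothetical helper: build the xmllint argument list.
# Kept separate so the command can be inspected without running xmllint.
def xmllint_command(xsd_path, xml_path, stream: true)
  cmd = ['xmllint', '--noout']           # --noout: do not print the parsed document
  cmd << '--stream' if stream            # assumption: streaming mode is usable with --schema
  cmd + ['--schema', xsd_path.to_s, xml_path.to_s]
end

# Hypothetical helper: run xmllint and return [success?, stderr_lines].
# xmllint writes its diagnostics to stderr and exits non-zero on failure.
# Raises Errno::ENOENT if xmllint is not installed.
def validate_with_xmllint(xsd_path, xml_path, stream: true)
  command = xmllint_command(xsd_path, xml_path, stream: stream)
  _stdout, stderr, status = Open3.capture3(*command)
  [status.success?, stderr.lines.map(&:chomp)]
end
```

With the paths from the question, a call might look like validate_with_xmllint(Rails.root.join('files/xsd', self::XSD), Rails.root.join('public/uploads', file_path)). Because xmllint has to parse the file before it can validate it, parse errors such as the unclosed quote above should show up in the returned lines alongside schema violations.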