How do I validate an XML file against a schema using Scala?

Asked: 2009-10-26 20:19:28

Tags: xml scala

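I wrote a simple Scala program that opens an XML file.

Is there a way to have Scala validate the XML file against the schema file it references? At the moment my XML file does not conform to the schema, so I would expect validation to report errors.

The XML file references the schema in its root element like this: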

<items xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="items.xsd">

Scala code:

import scala.xml._

object HelloWorld {
  def main(args: Array[String]) {
    println("Hello, world! " + args.toList)

    val start = System.currentTimeMillis
    val data = XML.loadFile(args(0))
    val stop = System.currentTimeMillis
    Console.println("Took " + (stop-start)/1000.0 + "s to load " + args(0))
  }
}
HelloWorld.main(args)

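3 answers:

Answer 0 (score: 6)

Here is a blog post describing how to do schema validation in Scala using the Java libraries:

http://sean8223.blogspot.com/2009/09/xsd-validation-in-scala.html

It boils down to a basic re-implementation of XML.load: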

import javax.xml.parsers.SAXParser
import javax.xml.parsers.SAXParserFactory
import javax.xml.validation.Schema
import org.xml.sax.InputSource
import scala.xml.{Elem, TopScope}
import scala.xml.parsing.NoBindingFactoryAdapter

class SchemaAwareFactoryAdapter(schema:Schema) extends NoBindingFactoryAdapter {

  override def loadXML(source: InputSource): Elem = {
    // create parser
    val parser: SAXParser = try {
      val f = SAXParserFactory.newInstance()
      f.setNamespaceAware(true)
      f.setFeature("http://xml.org/sax/features/namespace-prefixes", true)
      f.newSAXParser()
    } catch {
      case e: Exception =>
        Console.err.println("error: Unable to instantiate parser")
        throw e
    }

    val xr = parser.getXMLReader()
    val vh = schema.newValidatorHandler()
    vh.setContentHandler(this)
    xr.setContentHandler(vh)

    // parse file
    scopeStack.push(TopScope)
    xr.parse(source)
    scopeStack.pop
    return rootElem.asInstanceOf[Elem]
  }
}

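Answer 1 (score: 2)

I don't think you can do this with the Scala libraries alone, but you can definitely use the Java libraries. Just search for "java schema validation" and you will find plenty of options.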
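
As one illustration of the Java-library route, here is a minimal sketch that calls the JDK's javax.xml.validation API directly from Scala. The object name XsdValidation and its validate method are hypothetical helpers introduced only for this example, and the XML and XSD are passed as strings to keep it self-contained. It relies on the documented default behaviour of Validator, which throws a SAXException on the first validation error when no ErrorHandler is set.

```scala
import java.io.StringReader
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import org.xml.sax.SAXException

// Hypothetical helper, not part of any library.
object XsdValidation {
  /** Returns Right(()) if `xml` conforms to `xsd`, otherwise Left(parser message). */
  def validate(xml: String, xsd: String): Either[String, Unit] =
    try {
      // Build a W3C XML Schema object from the XSD text
      val factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
      val schema = factory.newSchema(new StreamSource(new StringReader(xsd)))
      // With no ErrorHandler set, validate() throws on the first error
      schema.newValidator().validate(new StreamSource(new StringReader(xml)))
      Right(())
    } catch {
      // Covers both a malformed schema and a non-conforming document
      case e: SAXException => Left(e.getMessage)
    }
}
```

To validate files instead of strings, new StreamSource(new java.io.File(path)) can be substituted for the StringReader-based sources. Note that this only checks the document; it does not build a scala.xml.Elem.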

Answer 2 (score: 2)

Here is a modified version that accounts for the minor API changes in 2.8.0 (or 2.8.1):

import org.xml.sax.InputSource
import scala.xml.parsing.NoBindingFactoryAdapter
import scala.xml.{TopScope, Elem}
import javax.xml.parsers.{SAXParserFactory, SAXParser}
import javax.xml.validation.Schema

class SchemaAwareFactoryAdapter(schema: Schema) extends NoBindingFactoryAdapter {
    override def loadXML(source: InputSource, parser: SAXParser) = {
        val reader = parser.getXMLReader()
        val handler = schema.newValidatorHandler()
        handler.setContentHandler(this)
        reader.setContentHandler(handler)

        scopeStack.push(TopScope)
        reader.parse(source)
        scopeStack.pop
        rootElem.asInstanceOf[Elem]
    }

    override def parser: SAXParser = {
        val factory = SAXParserFactory.newInstance()
        factory.setNamespaceAware(true)
        factory.setFeature("http://xml.org/sax/features/namespace-prefixes", true)
        factory.newSAXParser()
    }
}

The usage is also slightly different:

import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory

val factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
val xsdStream = getClass.getResourceAsStream("/foo.xsd")
val schema = factory.newSchema(new StreamSource(xsdStream))
val source = getClass.getResourceAsStream("baz.xml")
val xml = new SchemaAwareFactoryAdapter(schema).load(source)
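
The usage above loads the schema explicitly, whereas the question's document names its schema through xsi:noNamespaceSchemaLocation. A possible sketch of following that hint instead uses the no-argument SchemaFactory.newSchema(), which for W3C XML Schema is documented to locate schemas from hints in the instance document. The object name HintedValidation is a hypothetical helper. The sketch assumes the JDK's default XML settings allow loading the referenced schema and that the .xsd sits next to the XML file.

```scala
import java.io.File
import javax.xml.XMLConstants
import javax.xml.transform.stream.StreamSource
import javax.xml.validation.SchemaFactory
import org.xml.sax.SAXException

// Hypothetical helper, not part of any library.
object HintedValidation {
  /** Validates `xmlFile` against the schema its root element points to. */
  def validate(xmlFile: File): Either[String, Unit] =
    try {
      val factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
      // No-argument newSchema(): schemas are found via location hints in the document
      val schema = factory.newSchema()
      // StreamSource(File) carries the file's system ID, so a relative
      // hint such as "items.xsd" is resolved relative to the XML file
      schema.newValidator().validate(new StreamSource(xmlFile))
      Right(())
    } catch {
      // I/O failures (for example a missing XML file) are not caught here
      case e: SAXException => Left(e.getMessage)
    }
}
```

Trusting the document's own hint is convenient for local files, but it means the document chooses the schema it is checked against, so the explicit-schema usage above is the safer choice for untrusted input.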