Parsing an InputStreamReader with DocumentBuilder

Time: 2011-12-16 16:41:12

Tags: java file groovy utf-8 document

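I have very little Java experience. I am trying to force a document to be read as UTF-8, but I am having trouble hooking an InputStreamReader up to the DocumentBuilder.

Here is what I have so far: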

import javax.xml.xpath.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;


if( pathToFile == null ) throw new Exception("You must supply a pathToFile parameter");

DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

InputStreamReader in = new InputStreamReader( new FileInputStream( pathToFile ), "utf-8" );

BufferedReader reader = new BufferedReader ( new InputStreamReader ( in ) );

Element records = builder.parse(reader).getDocumentElement();

If anyone can offer some pointers, I would appreciate it.

2 answers:

Answer 0: (score: 8)

Don't wrap an InputStreamReader around another InputStreamReader. (Edit: also, since DocumentBuilder has no parse method that accepts a Reader, you need to wrap the reader in an InputSource):

if( pathToFile == null )
    throw new Exception("You must supply a pathToFile parameter");

DocumentBuilder builder = DocumentBuilderFactory.newInstance()
    .newDocumentBuilder();

InputStreamReader in = new InputStreamReader(
    new FileInputStream( pathToFile ), "utf-8" );

BufferedReader reader = new BufferedReader ( in ); // CHANGED: wrap the reader only once

// Needs: import org.xml.sax.InputSource;
InputSource input = new InputSource(reader);

Element records = builder.parse(input).getDocumentElement();
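
For anyone who wants to try this outside a script, here is a minimal self-contained sketch of the same approach in plain Java. The class name `Utf8XmlReaderDemo`, the method name `parseUtf8`, and the sample XML are made up for illustration. The try-with-resources block and the switch to `IllegalArgumentException` are additions that are not part of the snippet above.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class Utf8XmlReaderDemo {

    // Parses the file at pathToFile, decoding its bytes as UTF-8
    // instead of relying on the platform default charset.
    static Element parseUtf8(String pathToFile)
            throws IOException, SAXException, ParserConfigurationException {
        if (pathToFile == null) {
            throw new IllegalArgumentException("You must supply a pathToFile parameter");
        }
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // try-with-resources closes the reader (and the file stream under it) after parsing
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(pathToFile), StandardCharsets.UTF_8))) {
            // DocumentBuilder.parse has no Reader overload, so wrap the reader in an InputSource
            return builder.parse(new InputSource(reader)).getDocumentElement();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample data: a temporary file with non-ASCII text stored as UTF-8
        Path sample = Files.createTempFile("records", ".xml");
        Files.write(sample,
                "<records><record>caf\u00e9</record></records>".getBytes(StandardCharsets.UTF_8));

        Element records = parseUtf8(sample.toString());
        System.out.println(records.getTagName() + ": " + records.getTextContent());

        Files.delete(sample);
    }
}
```

Because the parser receives characters that are already decoded, the charset passed to the InputStreamReader decides how the bytes are interpreted.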

Answer 1: (score: 2)

Assuming this is Groovy, you can get rid of a lot of the Java cruft:

I haven't tried it, but:

if( pathToFile == null ) throw new Exception("You must supply a pathToFile parameter");

Element records = new File( pathToFile ).withReader( "utf-8" ) { r ->
  DocumentBuilderFactory.newInstance().newDocumentBuilder().with { b ->
    b.parse( new InputSource( r ) ).documentElement
  }
}

It should work...
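
For comparison, here is a rough plain-Java sketch of what `withReader` does in the snippet above: open a reader with the given charset, hand it to a block of code, and close it afterwards. The `WithReaderSketch` class, the `ReaderAction` interface, and the `withReader` helper are hypothetical names written for this illustration. They are not part of Groovy or the JDK.

```java
import java.io.BufferedReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class WithReaderSketch {

    // Hypothetical callback type: plays the role of the Groovy closure and may throw
    interface ReaderAction<T> {
        T apply(BufferedReader reader) throws Exception;
    }

    // Opens the file with the given charset, runs the action, and always closes the reader
    static <T> T withReader(Path file, Charset charset, ReaderAction<T> action) throws Exception {
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            return action.apply(reader);
        }
    }

    // Java counterpart of the Groovy one-liner: parse the file as UTF-8, return the root element
    static Element parseUtf8(Path file) throws Exception {
        return withReader(file, StandardCharsets.UTF_8, r ->
                DocumentBuilderFactory.newInstance().newDocumentBuilder()
                        .parse(new InputSource(r)).getDocumentElement());
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample data written to a temporary file
        Path sample = Files.createTempFile("records", ".xml");
        Files.write(sample,
                "<records><record id=\"1\"/></records>".getBytes(StandardCharsets.UTF_8));

        System.out.println(parseUtf8(sample).getTagName()); // prints "records"

        Files.delete(sample);
    }
}
```

The helper keeps the open and close logic in one place, which is the main convenience the Groovy version offers.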