Question

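I want to use XPath (in Java) to parse XML files. However, these XML files are only available on the web (downloading them by hand is not an option, although of course they have to be "downloaded" in order to be processed).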

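So basically my question is how to take a URI object and convert it to a File object. Do I need to use SCP or something along those lines to download the file? Any code, tutorials, or general advice would be greatly appreciated.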

I tried this:

    URI uri = new URI("http://www.somefiles.com/myfile.xml");
    InputStream is = uri.toURL().openStream();
    File xmlDocument = new File(uri);

But this results in a URI scheme is not "file" error.

Answer 1

You could make it more complicated, but this can be as simple as opening a stream from the URL.

    InputStream in = remoteURI.toURL().openStream();

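This is not the File object you originally asked for, but I would guess your XPath library can handle a generic InputStream. If it cannot, you will have to save the InputStream above to a temporary file and create a File object on that.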
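
For the standard javax.xml.xpath API that guess holds: XPath.evaluate accepts an org.xml.sax.InputSource, which can wrap an InputStream. Below is a minimal sketch of that route. The class name StreamXPath and its evaluate helper are made up for illustration and are not part of any library; wrapping the checked exception is just one possible choice.

```java
import java.io.InputStream;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper; the class and method names are assumptions for this sketch.
final class StreamXPath {
    private StreamXPath() {}

    /** Parses XML from the stream and returns the string value of the XPath result. */
    static String evaluate(InputStream in, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // XPath.evaluate(String, InputSource) parses the source and evaluates in one call
            return xpath.evaluate(expression, new InputSource(in));
        } catch (XPathExpressionException e) {
            // Covers both a malformed expression and XML that fails to parse
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }
}
```

With the stream from above, this would be called as StreamXPath.evaluate(in, expression), with the caller closing the stream afterwards.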

Answer 2

Try writing the XML to disk first:

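    // remoteURI is the java.net.URI of the remote document
    File tempDir = new File(System.getProperty("java.io.tmpdir"));
    File xmlDocument = new File(tempDir, "theXml.xml");
    // try-with-resources closes both streams even if the copy fails
    try (InputStream in = remoteURI.toURL().openStream();
         OutputStream out = new FileOutputStream(xmlDocument)) {
        // Copy in chunks rather than one byte at a time
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
    }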
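
On Java 7 or later the same copy can probably be done with java.nio.file.Files instead of a hand-written loop. This is only a sketch: TempXmlFile and save are hypothetical names, and the "remote-" prefix is an arbitrary choice.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; the class and method names are assumptions for this sketch.
final class TempXmlFile {
    private TempXmlFile() {}

    /** Copies the stream into a new temporary .xml file and returns it. The caller closes the stream. */
    static File save(InputStream in) {
        try {
            // createTempFile already creates an empty file, so the copy must be allowed to replace it
            Path target = Files.createTempFile("remote-", ".xml");
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target.toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unlike the fixed theXml.xml name, createTempFile picks a unique file name, so two downloads running at the same time do not overwrite each other.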

However, if all you need is to extract some data from the XML using XPath, there is no need to write anything to disk:

    InputStream in = remoteURI.toURL().openStream();
    StreamSource source = new StreamSource(in);
    DOMResult result = new DOMResult();
    // A Transformer created without a stylesheet copies the source to the result unchanged
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.transform(source, result);
    in.close();
    Document document = (Document) result.getNode();

    // Evaluate the XPath expression against the in-memory DOM
    XPath xpath = XPathFactory.newInstance().newXPath();
    xpath.evaluate("...", document);
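
The two-argument evaluate call above returns only the string value of the first match. If several nodes are needed, one option is to parse with a DocumentBuilder and ask for a node set. The following is a sketch under that assumption; XPathNodes and textOfAll are illustrative names, not part of any library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper; the class and method names are assumptions for this sketch.
final class XPathNodes {
    private XPathNodes() {}

    /** Returns the text content of every node matched by the expression, in document order. */
    static List<String> textOfAll(InputStream in, String expression) {
        try {
            // Parse the stream into a DOM tree held entirely in memory
            Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
            // NODESET makes evaluate return a NodeList instead of a String
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, document, XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (ParserConfigurationException | SAXException | IOException | XPathExpressionException e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }
}
```

Once the Document is built, it can be queried as many times as needed without reading from the network again.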

Retrieving and parsing XML from the web
