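Specifically, I'm using dom4j to read KML documents and parse some data out of the XML. When I pass the URL to the reader as a string, it's very simple and handles both filesystem URLs and web URLs: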
SAXReader reader = new SAXReader();
Document document = reader.read(url);
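The problem is that sometimes my code needs to handle KMZ documents, which are basically just zipped XML (KML) documents. Unfortunately, SAXReader offers no convenient way to handle this. I've found all sorts of funky solutions for determining whether a given file is a ZIP file, but my code quickly becomes bloated and nasty: reading streams, constructing files, checking for the "magic" hex bytes at the beginning, extracting, and so on.
Is there some quick and clean way to handle this? An easier way to connect to any URL and extract the contents if they're zipped, and otherwise just grab the XML?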
Answer 0 (score: 0)
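Hmm, it looks like KMZDOMLoader doesn't handle KMZ files on the web. A KMZ may be served dynamically, so it won't always be (a) a file reference or (b) something with a .kmz extension; on the web it has to be identified by its content type.
What I ended up doing was constructing a URL object and then getting its protocol. I have separate logic for handling local files and documents on the web. Then, inside each logic block, I have to determine whether the document is zipped. The SAXReader read() method accepts an input stream, so I found I could use a ZipInputStream for the KMZs.
Here's the code I ended up with: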
// First four bytes of a ZIP local file header ("PK\003\004"), read as a big-endian int
private static final long ZIP_MAGIC_NUMBERS = 0x504B0304;
private static final String KMZ_CONTENT_TYPE = "application/vnd.google-earth.kmz";

private Document getDocument(String urlString) throws IOException, DocumentException, URISyntaxException {
    InputStream inputStream = null;
    URL url = new URL(urlString);
    String protocol = url.getProtocol();
    /*
     * Figure out how to get the XML from the URL -- there are 4 possibilities:
     *
     * 1) a KML (uncompressed) doc on the filesystem
     * 2) a KMZ (compressed) doc on the filesystem
     * 3) a KML (uncompressed) doc on the web
     * 4) a KMZ (compressed) doc on the web
     */
    if (protocol.equalsIgnoreCase("file")) {
        // the provided input URL points to a file on a file system
        File file = new File(url.toURI());
        long n;
        // read the first four bytes; try-with-resources closes the file even if the read fails
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            n = raf.readInt();
        }
        if (n == KmlMetadataExtractorAdaptor.ZIP_MAGIC_NUMBERS) {
            // the file is a KMZ file; this assumes the first ZIP entry is the KML document
            inputStream = new ZipInputStream(new FileInputStream(file));
            ((ZipInputStream) inputStream).getNextEntry();
        } else {
            // the file is a KML file
            inputStream = new FileInputStream(file);
        }
    } else if (protocol.equalsIgnoreCase("http") || protocol.equalsIgnoreCase("https")) {
        // the provided input URL points to a web location
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.connect();
        // getContentType() returns null when the server sends no Content-Type header
        String contentType = connection.getContentType();
        if (contentType != null && contentType.contains(KmlMetadataExtractorAdaptor.KMZ_CONTENT_TYPE)) {
            // the target resource is KMZ; this assumes the first ZIP entry is the KML document
            inputStream = new ZipInputStream(connection.getInputStream());
            ((ZipInputStream) inputStream).getNextEntry();
        } else {
            // the target resource is KML
            inputStream = connection.getInputStream();
        }
    } else {
        // fail clearly instead of passing a null stream to SAXReader
        throw new IOException("Unsupported protocol: " + protocol);
    }
    try {
        return new SAXReader().read(inputStream);
    } finally {
        // close the stream even if parsing throws
        inputStream.close();
    }
}
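For comparison, here is a minimal sketch of a protocol-agnostic variant. It does not rely on the file/web split or on the content type; it peeks at the first four bytes of whatever stream it is given and checks them against the ZIP signature. The class name KmlStreams and the method openKml are hypothetical names introduced for illustration, and the sketch assumes the KML document is the first archive entry whose name ends in .kml, which may not hold for every KMZ.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of dom4j or any KML library.
final class KmlStreams {
    // First four bytes of a ZIP local file header: 'P' 'K' 0x03 0x04
    private static final byte[] ZIP_SIGNATURE = {0x50, 0x4B, 0x03, 0x04};

    private KmlStreams() {
    }

    /**
     * Returns a stream positioned at the start of the KML content.
     * Plain KML is returned as-is (buffered); for a ZIP archive the returned
     * stream is positioned at the first entry whose name ends in ".kml".
     */
    static InputStream openKml(InputStream raw) throws IOException {
        BufferedInputStream in = new BufferedInputStream(raw);
        // Remember the start so the peeked bytes can be replayed.
        in.mark(ZIP_SIGNATURE.length);
        byte[] head = new byte[ZIP_SIGNATURE.length];
        int read = 0;
        while (read < head.length) {
            int n = in.read(head, read, head.length - read);
            if (n < 0) {
                break; // stream is shorter than the signature
            }
            read += n;
        }
        in.reset();

        if (read < head.length || !Arrays.equals(head, ZIP_SIGNATURE)) {
            return in; // not a ZIP archive: treat as uncompressed KML
        }

        ZipInputStream zip = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zip.getNextEntry()) != null) {
            String name = entry.getName().toLowerCase(Locale.ROOT);
            if (!entry.isDirectory() && name.endsWith(".kml")) {
                return zip; // reads now yield this entry's bytes only
            }
        }
        zip.close();
        throw new IOException("No .kml entry found in archive");
    }
}
```

With a helper like this, the call site could shrink to something along the lines of new SAXReader().read(KmlStreams.openKml(new URL(urlString).openStream())), since URL.openStream() works for both file: and http(s): URLs. The caller is still responsible for closing the returned stream once parsing is done.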