Reading XML from the web

Time: 2018-05-24 13:08:19

Tags: java xml

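I am reading XML from a specific URL, but I get this error:

[Fatal Error] :3:24: Open quote is expected for attribute "http-equiv" associated with an element type "META".

The XML was missing the UTF-8 encoding declaration. I added it, but I still get this error. Any help is much appreciated.

Here is my code: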

import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URL;
import java.net.URLConnection;


import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;

public class crawleycraw {

    public static void main(String[] args) throws IOException, TransformerException, SAXException, ParserConfigurationException {
        // TODO Auto-generated method stub
        String urlString = "http://www.bnb.bg/";
        URL url = new URL(urlString);
        URLConnection conn = url.openConnection();

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(conn.getInputStream());

        TransformerFactory factoryl = TransformerFactory.newInstance();
        Transformer xform = factoryl.newTransformer();

        Transformer transformer = null;
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION,"no");

        xform.transform(new DOMSource(doc), new StreamResult(System.out));
    }

}

2 answers:

Answer 0 (score: 1)

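The URL http://www.bnb.bg/ returns HTML, not XML. An XML parser requires well-formed XML, so it throws an error as soon as it reaches markup that is legal in HTML but not in XML, such as an attribute value without quotes.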
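One way to catch this before parsing is to look at the response's Content-Type header. The following is only a minimal sketch: `XmlContentTypeCheck` and `isXmlContentType` are hypothetical names, not part of the question's code, and the header values in `main` are illustrative rather than what any particular server sends.

```java
import java.util.Locale;

final class XmlContentTypeCheck {

    // Hypothetical helper: true when a Content-Type header value names an XML media type.
    static boolean isXmlContentType(String contentType) {
        if (contentType == null) {
            return false; // the server sent no Content-Type header
        }
        // Drop parameters such as "; charset=UTF-8" and normalize case.
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.equals("text/xml")
                || mediaType.equals("application/xml")
                || mediaType.endsWith("+xml");
    }

    public static void main(String[] args) {
        // Illustrative header values only.
        System.out.println(isXmlContentType("text/html; charset=UTF-8"));
        System.out.println(isXmlContentType("application/xml"));
    }
}
```

In the question's code, the value of `conn.getContentType()` could be passed to this helper before calling `builder.parse(...)`, so that an HTML response is rejected with a clear message instead of a parser error.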

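Your code also has other problems, as Srinevu's answer points out.

Even if you download the above URL with curl, wget, or a browser, save it as example.xml, and open it in any XML editor, you will see exactly the same error that the Java parser reports.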
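The same failure can be reproduced without any network access. The sketch below feeds a small HTML fragment to the same kind of `DocumentBuilder` the question uses. The fragment is a made-up example with an unquoted `http-equiv` attribute, not the actual content of the page, and `HtmlIsNotXmlDemo` and `isWellFormedXml` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXParseException;

final class HtmlIsNotXmlDemo {

    // Returns true if the text parses as well-formed XML, false if the parser rejects it.
    static boolean isWellFormedXml(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXParseException e) {
            // Thrown for input that is not well-formed, e.g. unquoted attribute values.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected for in-memory input.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up HTML fragment: legal HTML, but the attribute values are not quoted.
        String html = "<html>\n<head>\n<META http-equiv=Content-Type content=text/html>\n</head>\n</html>";
        String xml = "<Customers><Customer Name=\"Test_91\" Code=\"91\"/></Customers>";
        System.out.println("HTML fragment well-formed? " + isWellFormedXml(html));
        System.out.println("XML sample well-formed?    " + isWellFormedXml(xml));
    }
}
```

The parser's default error handler may also print its own fatal error line to standard error when the HTML fragment is rejected.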

String urlString = "http://www.bnb.bg/";

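Answer 1 (score: 0)

Your code is fine, except that it sets an output property on a transformer that is null, which throws a NullPointerException. You may also have a problem with the response coming back from the URL. Here I tried it with a simple XML string, and it works fine: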

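import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

public class crawleycraw {
    public static void main(String[] args) throws Exception {
        // Inline XML used in place of the URL response.
        String urlString = "<Customers><Customer Name=\"Test_91\" Code=\"91\"/><Customer Name=\"Test_92\" Code=\"92\"/></Customers>";
        // URL url = new URL(urlString);
        // URLConnection conn = url.openConnection();
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(urlString.getBytes()));
        TransformerFactory factoryl = TransformerFactory.newInstance();
        Transformer xform = factoryl.newTransformer();
        // Transformer transformer = null;  // removed: calling a method on null throws NullPointerException
        // Set the property on the transformer that actually performs the transform.
        xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
        xform.transform(new DOMSource(doc), new StreamResult(System.out));
    }
}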
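To check the effect of the output property without reading the console, the document can be serialized to a string instead of `System.out`. This is a small sketch of that idea. `SerializeDemo` and `roundTrip` are hypothetical names, and the exact attributes inside the XML declaration depend on the transformer implementation in use.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class SerializeDemo {

    // Parses the XML text and serializes it back to a string, keeping the XML declaration.
    static String roundTrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Transformer xform = TransformerFactory.newInstance().newTransformer();
            // The property is set on the same instance that performs the transform.
            xform.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
            StringWriter out = new StringWriter();
            xform.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Not expected for well-formed in-memory input.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Customers><Customer Name=\"Test_91\" Code=\"91\"/></Customers>";
        System.out.println(roundTrip(xml));
    }
}
```

Writing to a `StringWriter` also makes it easy to compare the result in a unit test.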