Question

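I want to parse a very long string from an XML file. You can see the XML file here. In that file, I need to parse the string inside the "description" tag. When the "description" tag holds a short string, say 3 or 4 lines, my parser (a Java SAX parser) handles it easily. But when the string runs to hundreds of lines, my parser fails to parse it. Please check the code I use for parsing and tell me where I am going wrong. Any help would be much appreciated.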

This is the parser's GetterSetter class:

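import java.util.ArrayList;

public class MyGetterSetter
{
    // Collected text of every parsed <description> element.
    private ArrayList<String> description = new ArrayList<String>();

    public ArrayList<String> getDescription()
    {
        return description;
    }

    // Despite the name, this appends to the list rather than replacing it.
    public void setDescription(String description)
    {
        this.description.add(description);
    }
}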

This is the parser's Handler class:

import android.util.Log;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyHandler extends DefaultHandler
{
    String elementValue = null;
    Boolean elementOn = false;
    Boolean item = false;

    public static MyGetterSetter data = null;

    public static MyGetterSetter getXMLData()
    {
        return data;
    }

    public static void setXMLData(MyGetterSetter data)
    {
        MyHandler.data = data;
    }

    public void startDocument() throws SAXException
    {
        data = new MyGetterSetter();
    }

    public void endDocument() throws SAXException
    {
    }

    public void startElement(String namespaceURI, String localName, String qName, Attributes atts) throws SAXException
    {
        elementOn = true;

        if (localName.equalsIgnoreCase("item"))
        {
            item = true;
        }
    }

    public void endElement(String namespaceURI, String localName, String qName) throws SAXException
    {
        elementOn = false;

        if (item)
        {
            if (localName.equalsIgnoreCase("description"))
            {
                data.setDescription(elementValue);
                Log.d("--------DESCRIPTION------", elementValue + " ");
            }
            else if (localName.equalsIgnoreCase("item"))
            {
                item = false;
            }
        }
    }

    public void characters(char ch[], int start, int length)
    {
        if (elementOn)
        {
            elementValue = new String(ch, start, length);
            elementOn = false;
        }
    }
}

Answer 1

Use the org.w3c.dom package.

import java.net.URL;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DescriptionParser {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.aboutsports.co.uk/fixtures/");

            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            Document doc = dBuilder.parse(url.openStream());

            NodeList list = doc.getElementsByTagName("item"); // get <item> nodes

            for (int i = 0; i < list.getLength(); i++) {
                Node item = list.item(i);
                NodeList descriptions = ((Element) item).getElementsByTagName("description"); // get <description> nodes within an <item>
                for (int j = 0; j < descriptions.getLength(); j++) {
                    Node description = descriptions.item(j); // use the loop index, not always item(0)

                    System.out.println(description.getTextContent()); // print the text content
                }
            }

        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

XPath in Java is also well suited to extracting pieces of an XML document. Here's an example.

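You would use an XPathExpression such as /item/description (or //item/description when <item> is not the root element of the document). When you evaluate it against the XML InputStream, it returns a NodeList of all <description> elements inside <item> elements, just like the code above.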
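As a rough illustration of the XPath approach, here is a minimal sketch using the standard javax.xml.xpath API. The class name XPathDescriptionSketch, the helper method descriptions, and the inline sample XML are made up for this example; the expression //item/description is an assumption that fits a feed where <item> is nested below the root.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original question or answer.
class XPathDescriptionSketch {

    // Returns the text of every <description> that is a direct child of an <item>.
    static List<String> descriptions(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//item/description",                   // assumed expression
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document; the channel-level <description> is skipped.
        String xml = "<rss><channel><description>feed</description>"
                + "<item><description>first</description></item>"
                + "<item><description>second</description></item>"
                + "</channel></rss>";
        System.out.println(descriptions(xml));
    }
}
```

To read from the network instead, an InputSource wrapping url.openStream() could be passed in place of the StringReader.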

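If you want to stick with the DefaultHandler approach, you will need to set and unset flags so that you can check whether you are inside the body of a <description> element. The code above probably does something similar internally and hides it from you. It is already available in Java, so why not use it?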
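For the DefaultHandler route, one detail is worth spelling out: a SAX parser is allowed to deliver the text of a single element across several characters() calls, so a handler that keeps only one chunk can lose most of a long string. That may well be what happens in the question, though this has not been verified against the actual feed. The sketch below sets and unsets flags and accumulates text in a StringBuilder. The class name BufferingDescriptionHandler and the helpers parse and name are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, not the asker's class: buffers text across characters() calls.
class BufferingDescriptionHandler extends DefaultHandler {
    private final List<String> descriptions = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private boolean inItem = false;
    private boolean inDescription = false;

    List<String> getDescriptions() {
        return descriptions;
    }

    // Some parsers fill localName, others only qName; use whichever is present.
    private static String name(String localName, String qName) {
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        String n = name(localName, qName);
        if (n.equalsIgnoreCase("item")) {
            inItem = true;
        } else if (inItem && n.equalsIgnoreCase("description")) {
            inDescription = true;
            text.setLength(0); // start a fresh buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inDescription) {
            text.append(ch, start, length); // append every chunk, not just the first
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String n = name(localName, qName);
        if (inDescription && n.equalsIgnoreCase("description")) {
            descriptions.add(text.toString());
            inDescription = false;
        } else if (n.equalsIgnoreCase("item")) {
            inItem = false;
        }
    }

    // Convenience wrapper for trying the handler on an in-memory document.
    static List<String> parse(String xml) {
        BufferingDescriptionHandler handler = new BufferingDescriptionHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return handler.getDescriptions();
    }
}
```

Calling BufferingDescriptionHandler.parse(xml) returns one string per <description> inside an <item>, however many chunks the parser used to deliver it.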
