How to get values from XML that is not well-formed

Asked: 2014-03-17 12:53:34

Tags: java xml

I have the following string (you could call it XML):

<News News-type="alert" ID="498" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="1507" NewsPath="GetNewsFrom[3]" NewsMark="0"/>
<News News-type="alert" ID="1509" NewsPath="GetNewsFrom[3]" NewsMark="0"/>
<News News-type="alert" ID="1511" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="1520" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="2999" NewsPath="data-theft[1]" NewsMark="0" />
<News News-type="alert" ID="2535" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="6052" NewsPath="GetNewsFrom[3]" NewsMark="100" />

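I cannot apply an XML reader/parser to it; the parser reports that it is not a well-formed XML file. Can you help me get the following output from these strings?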

String[] attr = {"News-type", "ID", "NewsPath", "NewsMark"};
String[] values = new String[4];
// The values should be filled into this array dynamically as well
int i;
for (i = 0; i < 4; i++)
{
    if (i == 0)
        values[i] = ????;
    else if (i == 1)
    ...
}

How can I get all the attribute values into the values[] array so that I can use them further?

Exception:
When I pass it to Java as an XML file, I get this exception on execution:

[Fatal Error] :2:2: The markup in the document following the root element must be well-formed.
Mar 18, 2014 11:43:21 AM GUI.NewsReport jMenuItem2ActionPerformed
SEVERE: null
org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 2; The markup in the document following the root element must be well-formed.
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:121)
        at GUI.NewsReport.ReadXML(NewsReport.java:185)
        at GUI.NewsReport.jMenuItem2ActionPerformed(NewsReport.java:126)
        at GUI.NewsReport.access$100(NewsReport.java:33)
        at GUI.NewsReport$2.actionPerformed(NewsReport.java:88)
        at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:2018)
        at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2341)
        at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:402)
        at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:259)
        at javax.swing.AbstractButton.doClick(AbstractButton.java:376)
        at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:833)
        at javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(BasicMenuItemUI.java:877)
        at java.awt.Component.processMouseEvent(Component.java:6505)
        at javax.swing.JComponent.processMouseEvent(JComponent.java:3320)
        at java.awt.Component.processEvent(Component.java:6270)
        at java.awt.Container.processEvent(Container.java:2229)
        at java.awt.Component.dispatchEventImpl(Component.java:4861)
        at java.awt.Container.dispatchEventImpl(Container.java:2287)
        at java.awt.Component.dispatchEvent(Component.java:4687)
        at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4832)
        at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4492)
        at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4422)
        at java.awt.Container.dispatchEventImpl(Container.java:2273)
        at java.awt.Window.dispatchEventImpl(Window.java:2719)
        at java.awt.Component.dispatchEvent(Component.java:4687)
        at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:735)
        at java.awt.EventQueue.access$200(EventQueue.java:103)
        at java.awt.EventQueue$3.run(EventQueue.java:694)
        at java.awt.EventQueue$3.run(EventQueue.java:692)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
        at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:87)
        at java.awt.EventQueue$4.run(EventQueue.java:708)
        at java.awt.EventQueue$4.run(EventQueue.java:706)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.security.ProtectionDomain$1.doIntersectionPrivilege(ProtectionDomain.java:76)
        at java.awt.EventQueue.dispatchEvent(EventQueue.java:705)
        at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:242)
        at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:161)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:150)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:146)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:138)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:91)

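  • Thanks a lot!

3 Answers:

Answer 0 (score: 3)

There is no single root element, so it is not a well-formed XML document... although it may be a well-formed XML document fragment.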

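If the latter is true, the simplest way to parse it in Java is to implement a modified reader that wraps a dummy top-level element around it: prefix the content with (for example) <wrapper> and follow it with </wrapper>. Then implement the rest of the application with the awareness that the <wrapper> element is not part of the original file content.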
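The wrapping reader described above could be sketched roughly as follows. This is only a minimal illustration, not the answerer's code: the class name WrappedFragmentReader and the method parseFragment are hypothetical, and it assumes the fragment is UTF-8 encoded and does not start with its own XML declaration. It chains three streams with java.io.SequenceInputStream so the parser sees a single root element.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: the class and method names are illustrative only.
public class WrappedFragmentReader {

    // Parses a stream of sibling elements by surrounding it with a dummy <wrapper> root.
    // Assumption: the fragment is UTF-8 and carries no <?xml ...?> declaration.
    public static Document parseFragment(InputStream fragment) throws Exception {
        InputStream open = new ByteArrayInputStream("<wrapper>".getBytes(StandardCharsets.UTF_8));
        InputStream close = new ByteArrayInputStream("</wrapper>".getBytes(StandardCharsets.UTF_8));
        // SequenceInputStream reads the three streams one after another.
        InputStream wrapped = new SequenceInputStream(
                Collections.enumeration(Arrays.asList(open, fragment, close)));
        return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(wrapped);
    }

    public static void main(String[] args) throws Exception {
        String fragment =
                "<News News-type=\"alert\" ID=\"498\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"0\" />\n"
              + "<News News-type=\"alert\" ID=\"1507\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"0\"/>\n";
        Document doc = parseFragment(
                new ByteArrayInputStream(fragment.getBytes(StandardCharsets.UTF_8)));
        // The root is the dummy wrapper; the News elements are its children.
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getElementsByTagName("News").getLength());
    }
}
```

Code that walks the resulting document should skip the root element, since wrapper exists only to satisfy the parser.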

Answer 1 (score: 1)

In this case, a simple way to solve the problem is to add a parent tag around all the News tags and then parse the result like any other XML.

<NewsParent>
<News News-type="alert" ID="498" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="1507" NewsPath="GetNewsFrom[3]" NewsMark="0"/>
<News News-type="alert" ID="1509" NewsPath="GetNewsFrom[3]" NewsMark="0"/>
<News News-type="alert" ID="1511" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="1520" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="2999" NewsPath="data-theft[1]" NewsMark="0" />
<News News-type="alert" ID="2535" NewsPath="GetNewsFrom[3]" NewsMark="0" />
<News News-type="alert" ID="6052" NewsPath="GetNewsFrom[3]" NewsMark="100" />
</NewsParent>
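Once the parent tag is in place, the attribute values can be read with the standard DOM API. The sketch below is one possible way to fill a values array per News element, as the question asks; the class name NewsAttributeExtractor and the method extract are hypothetical, and it assumes the News lines are already available as a String.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: the class and method names are illustrative only.
public class NewsAttributeExtractor {

    static final String[] ATTR = {"News-type", "ID", "NewsPath", "NewsMark"};

    // Returns one String[4] per News element, ordered like ATTR.
    public static List<String[]> extract(String newsFragment) throws Exception {
        // Add the parent tag so the input has a single root element.
        String xml = "<NewsParent>" + newsFragment + "</NewsParent>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList items = doc.getElementsByTagName("News");
        List<String[]> rows = new ArrayList<>();
        for (int n = 0; n < items.getLength(); n++) {
            Element news = (Element) items.item(n);
            String[] values = new String[ATTR.length];
            for (int i = 0; i < ATTR.length; i++) {
                // getAttribute returns an empty string when the attribute is absent.
                values[i] = news.getAttribute(ATTR[i]);
            }
            rows.add(values);
        }
        return rows;
    }

    public static void main(String[] args) throws Exception {
        String fragment =
                "<News News-type=\"alert\" ID=\"2999\" NewsPath=\"data-theft[1]\" NewsMark=\"0\" />\n"
              + "<News News-type=\"alert\" ID=\"6052\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"100\" />\n";
        for (String[] values : extract(fragment)) {
            System.out.println(String.join(", ", values));
        }
    }
}
```

Looping over ATTR replaces the if/else chain on the index from the question, since each array position maps directly to one attribute name.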

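Answer 2 (score: 0)

Besides doing some preprocessing (which should be preferred over regular expressions), another option is to use a regular expression such as: News-type="([^"]+?)"\s+ID="([^"]+?)"\s+NewsPath="([^"]+?)"\s+NewsMark="([^"]+?)"

The regular expression above should match the data you have and place the values in groups that you can access later.

An explanation of the regular expression is available here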
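As a rough illustration of the regular-expression option, the pattern can be compiled with java.util.regex and its four capture groups read for each match. The class name NewsRegexExtractor and the method extract are hypothetical. The sketch assumes the attributes always appear in the order shown in the question and are separated by whitespace; input that deviates from this will simply not match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the class and method names are illustrative only.
public class NewsRegexExtractor {

    // In a Java string literal, each quote is written as \" and \s as \\s.
    private static final Pattern NEWS = Pattern.compile(
            "News-type=\"([^\"]+?)\"\\s+ID=\"([^\"]+?)\"\\s+"
          + "NewsPath=\"([^\"]+?)\"\\s+NewsMark=\"([^\"]+?)\"");

    // Returns one String[4] per match: News-type, ID, NewsPath, NewsMark.
    public static List<String[]> extract(String input) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = NEWS.matcher(input);
        while (m.find()) {
            rows.add(new String[] {m.group(1), m.group(2), m.group(3), m.group(4)});
        }
        return rows;
    }

    public static void main(String[] args) {
        String input =
                "<News News-type=\"alert\" ID=\"498\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"0\" />\n"
              + "<News News-type=\"alert\" ID=\"6052\" NewsPath=\"GetNewsFrom[3]\" NewsMark=\"100\" />\n";
        for (String[] values : extract(input)) {
            System.out.println(String.join(", ", values));
        }
    }
}
```

Unlike a real XML parser, this approach does not decode entities such as &amp;quot; inside attribute values, which is one reason the preprocessing route is usually the safer choice.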