Here is a basic XML document:
<h1>My Heading</h1>
<p align = "center"> My paragraph
<img src="smiley.gif" alt="Smiley face" height="42" width="42"></img>
<img src="sad.gif" alt="Sad face" height="45" width="45"></img>
<img src="funny.gif" alt="Funny face" height="48" width="48"></img>
</p>
<p>My para</p>
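What I want to do is find an element and all of its attributes, and store the attribute name and attribute value for each element. This is my code so far: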
private Map <String, String> tag = new HashMap <String,String> ();
public Map <String, String> findElement () {
try {
FileReader fRead = new FileReader (sourcePage);
BufferedReader bRead = new BufferedReader (fRead);
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance ();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder ();
Document doc = docBuilder.parse(new FileInputStream (new File (sourcePage)));
XPathFactory xFactory = XPathFactory.newInstance ();
XPath xPath = xFactory.newXPath ();
NodeList nl = (NodeList) xPath.evaluate("//img/@*", doc, XPathConstants.NODESET);
for( int i=0; i<nl.getLength (); i++) {
Attr attr = (Attr) nl.item(i);
String name = attr.getName();
String value = attr.getValue();
tag.put (name,value);
}
bRead.close ();
fRead.close ();
}
catch (Exception e) {
e.printStackTrace();
System.err.println ("An error has occured.");
}
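The problem appears when I look up the attributes of img, because every img has the same attribute names. A HashMap does not fit this case, because it overwrites the value stored under an existing key. Maybe I am using the wrong expression to find all the attributes. Is there another way to get the attribute names and values of the nth img element?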
Answer 0 (score: 1)
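First, let's level the playing field a bit. I cleaned up your code a little to get a compiling starting point. I removed the unnecessary code and fixed the method according to my best guess of what it is supposed to do. Then I generalized it slightly so that it accepts a tagName parameter. It is still the same code and makes the same mistake, but now it compiles (it uses Java 7 features for convenience; switch it back to Java 6 if you want). I also split the try-catch into multiple blocks, just for the sake of it: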
public Map<String, String> getElementAttributesByTagName(String tagName) {
Document document;
try (InputStream input = new FileInputStream(sourcePage)) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
document = docBuilder.parse(input);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new RuntimeException(e);
}
NodeList attributeList;
try {
XPath xPath = XPathFactory.newInstance().newXPath();
attributeList = (NodeList)xPath.evaluate("//" + tagName + "/@*", document, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
throw new RuntimeException(e);
}
Map<String, String> tagInfo = new HashMap<>();
for (int i = 0; i < attributeList.getLength(); i++) {
Attr attribute = (Attr)attributeList.item(i);
tagInfo.put(attribute.getName(), attribute.getValue());
}
return tagInfo;
}
When run against the sample document above, it returns:
{height=48, alt=Funny face, width=48, src=funny.gif}
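The result above holds only the attributes of the last <img> because every <img> uses the same attribute names, and Map.put replaces whatever value is already stored under a key. A minimal sketch of that effect, using a hypothetical OverwriteDemo class and hard-coded src values in place of parsed attributes:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the values are hard-coded stand-ins for parsed attributes.
final class OverwriteDemo {

    // Puts the same key three times, as the attribute loop does for three <img> elements.
    static Map<String, String> collectSrc() {
        Map<String, String> tag = new HashMap<>();
        tag.put("src", "smiley.gif");
        tag.put("src", "sad.gif");   // replaces smiley.gif
        tag.put("src", "funny.gif"); // replaces sad.gif
        return tag;
    }

    public static void main(String[] args) {
        // The map ends up with a single entry holding the value that was put last.
        System.out.println(collectSrc());
    }
}
```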
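The solution depends on your expected output. You either
1. want the attributes of only one <img> element (say, the first one), or
2. want a list of all <img> elements and their attributes.
For the first solution, it is enough to change the XPath expression to
//descendant::img[1]/@*
or, with the tagName parameter, to
"//descendant::" + tagName + "[1]/@*"
Note that this is not the same as //img[1]/@*, which selects every <img> that is the first <img> child of its own parent, even though it returns the same element in this particular case.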
With that change, the method returns:
{height=42, alt=Smiley face, width=42, src=smiley.gif}
which are the correct attributes of the first <img> element.
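The question also asks about the nth <img> element. One way to sketch that is to parenthesize the path, as in (//img)[n], so that the position predicate applies to the whole node-set in document order. The class name NthElementAttributes, the method attributesOfNth, and the <body> wrapper around the sample are assumptions made for this illustration (the wrapper gives the sample a single root element so that it parses as XML); they are not part of the original answer.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
final class NthElementAttributes {

    // The question's sample, wrapped in an assumed <body> root so that it is well-formed XML.
    static final String SAMPLE =
        "<body><h1>My Heading</h1><p align=\"center\"> My paragraph"
        + "<img src=\"smiley.gif\" alt=\"Smiley face\" height=\"42\" width=\"42\"></img>"
        + "<img src=\"sad.gif\" alt=\"Sad face\" height=\"45\" width=\"45\"></img>"
        + "<img src=\"funny.gif\" alt=\"Funny face\" height=\"48\" width=\"48\"></img>"
        + "</p><p>My para</p></body>";

    // Returns the attributes of the n-th (1-based) tagName element in document order.
    // An out-of-range n yields an empty map.
    static Map<String, String> attributesOfNth(String xml, String tagName, int n) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // The parentheses make [n] index the whole //tagName node-set.
            String expression = "(//" + tagName + ")[" + n + "]/@*";
            NodeList attributes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, document, XPathConstants.NODESET);
            Map<String, String> tagInfo = new HashMap<>();
            for (int i = 0; i < attributes.getLength(); i++) {
                Attr attribute = (Attr) attributes.item(i);
                tagInfo.put(attribute.getName(), attribute.getValue());
            }
            return tagInfo;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Prints the attributes of the second <img> in the sample.
        System.out.println(attributesOfNth(SAMPLE, "img", 2));
    }
}
```

Because all attributes now come from a single element, the names are unique and a plain HashMap is sufficient.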
Note that you don't even need an XPath expression for this kind of job. Here is a non-XPath version:
public Map<String, String> getElementAttributesByTagNameNoXPath(String tagName) {
Document document;
try (InputStream input = new FileInputStream(sourcePage)) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
document = docBuilder.parse(input);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new RuntimeException(e);
}
Node node = document.getElementsByTagName(tagName).item(0);
NamedNodeMap attributeMap = node.getAttributes();
Map<String, String> tagInfo = new HashMap<>();
for (int i = 0; i < attributeMap.getLength(); i++) {
Node attribute = attributeMap.item(i);
tagInfo.put(attribute.getNodeName(), attribute.getNodeValue());
}
return tagInfo;
}
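The second solution needs a slightly different approach. We want to return the attributes of all <img> elements in the document. Multiple elements mean we will use a List holding multiple Map<String, String> instances, where each Map represents one <img> element.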
A full XPath version, in case you really need some complex XPath expression:
public List<Map<String, String>> getElementsAttributesByTagName(String tagName) {
Document document;
try (InputStream input = new FileInputStream(sourcePage)) {
DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docBuilderFactory.newDocumentBuilder();
document = docBuilder.parse(input);
} catch (IOException | ParserConfigurationException | SAXException e) {
throw new RuntimeException(e);
}
NodeList nodeList;
try {
XPath xPath = XPathFactory.newInstance().newXPath();
nodeList = (NodeList)xPath.evaluate("//" + tagName, document, XPathConstants.NODESET);
} catch (XPathExpressionException e) {
throw new RuntimeException(e);
}
List<Map<String, String>> tagInfoList = new ArrayList<>();
for (int i = 0; i < nodeList.getLength(); i++) {
Node node = nodeList.item(i);
NamedNodeMap attributeMap = node.getAttributes();
Map<String, String> tagInfo = new HashMap<>();
for (int j = 0; j < attributeMap.getLength(); j++) {
Node attribute = attributeMap.item(j);
tagInfo.put(attribute.getNodeName(), attribute.getNodeValue());
}
tagInfoList.add(tagInfo);
}
return tagInfoList;
}
To get rid of the XPath part, you can simply replace it with a one-liner:
NodeList nodeList = document.getElementsByTagName(tagName);
Both versions, when run against your test case with the "img" parameter, return this (formatted for clarity):
[ {height=42, alt=Smiley face, width=42, src=smiley.gif},
{height=45, alt=Sad face, width=45, src=sad.gif },
{height=48, alt=Funny face, width=48, src=funny.gif } ]
which is the correct list of all <img> elements.
Answer 1 (score: 0)
Try using
Map <String, ArrayList<String>> tag = new HashMap <String, ArrayList<String>> ();
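One way such a map might be filled is to append each attribute value to the list stored under its name instead of overwriting it. The sketch below assumes Java 8 (for computeIfAbsent); the class AttributeValuesByName, the method group, and the hard-coded name/value pairs are hypothetical stand-ins for the Attr nodes returned by //img/@*.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; name/value pairs stand in for parsed Attr nodes.
final class AttributeValuesByName {

    // Appends each value to the list kept under its attribute name instead of overwriting it.
    static Map<String, List<String>> group(String[][] nameValuePairs) {
        Map<String, List<String>> tag = new HashMap<>();
        for (String[] pair : nameValuePairs) {
            tag.computeIfAbsent(pair[0], key -> new ArrayList<>()).add(pair[1]);
        }
        return tag;
    }

    public static void main(String[] args) {
        String[][] pairs = {
            {"src", "smiley.gif"}, {"height", "42"},
            {"src", "sad.gif"}, {"height", "45"},
            {"src", "funny.gif"}, {"height", "48"},
        };
        Map<String, List<String>> tag = group(pairs);
        // Index 1 in the "src" list is the src of the second <img>.
        System.out.println(tag.get("src").get(1));
    }
}
```

The positions in each list line up with the element order only if every <img> carries every attribute; an element that omits one would shift the later entries in that list.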
Answer 2 (score: 0)
You can use a Map inside a Map:
Map<Map<Integer, String>, String> // Integer = "some index" 0, 1, etc. (generics cannot use the primitive int); String1 (the value of the second Map) = src; String2 (the value of the original Map) = smiley.gif
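A related way to combine an index with the attributes is to key an outer map by the element index and keep a name-to-value map per element. This is a variation on the idea above, not the exact type shown in the answer; the class IndexedAttributes and its put helper are hypothetical, and the values are hard-coded for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; a variation on the map-in-a-map idea, not the answer's exact type.
final class IndexedAttributes {

    // Stores one attribute of the element at the given index, creating the inner map on first use.
    static void put(Map<Integer, Map<String, String>> byIndex, int index, String name, String value) {
        Map<String, String> attributes = byIndex.get(index);
        if (attributes == null) {
            attributes = new HashMap<>();
            byIndex.put(index, attributes);
        }
        attributes.put(name, value);
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, String>> byIndex = new HashMap<>();
        put(byIndex, 0, "src", "smiley.gif");
        put(byIndex, 0, "height", "42");
        put(byIndex, 1, "src", "sad.gif");
        // Looks up the src attribute of the element stored under index 1.
        System.out.println(byIndex.get(1).get("src"));
    }
}
```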
OR
You can think about using it the other way around, like:
Map<String, String> // String1=key=smiley.gif , String2=value=src