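I am trying to split a large XML file (500 MB) using JDOM (I know I should try SAX, but...), and I get an org.jdom.IllegalAddException: The Content already has an existing parent "root" at the point marked in the code below.
The sample XML and the code follow. I believe all the index checks and other trivial details are correct.
Sorry in advance for the large amount of code.
Thanks!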
<root>
<metadata><md1>...</md1><md2>...</md2></metadata>
<someOtherInfo><soi_1>...</soi_1></someOtherInfo>
<collection>
<item id="1">...</item><item id="2">...</item><item id="2">...</item>
</collection>
</root>
split() {
final String[] nodeNames = XmlUtils.getNodeNames(elementXpath); // returns {root, collection, item}
// creates tree of
//<root>
// <metadata><md1>...</md1><md2>...</md2></metadata>
// <someOtherInfo><soi_1>...</soi_1></someOtherInfo>
// <collection>
final Element originalDestination = importNodes(sourceDocument, nodeNames);
Element destination = null;
// traverses to "collection" element
Element source = sourceDocument.getRootElement();
for (int tempCount = 1; tempCount < nodeNames.length - 1; ++tempCount) {
source = source.getChild(nodeNames[tempCount]);
}
// get all "collection/item" elements
for (Object obj : source.getChildren(nodeNames[nodeNames.length - 1])) {
// makes sure that each out file has batchSize no of elements
if (groupCount % batchSize == 0) {
if (destination != null) {
// traverse and go back up to the root
Element root = destination;
while (root.getParentElement() != null) {
root = root.getParentElement();
}
// this is where I get -- org.jdom.IllegalAddException: The Content already has an existing parent "root" -- exception
final Document destDocument = new Document(destination);
// write file to disk and reset counters
} else {
// create complete clone of originalDestination so that even its parents are cloned
destination = createClone(originalDestination, nodeNames);
}
}
// add this "item" element to destination "collection" element
final Element element = (Element) obj;
destination.addContent(((Element) element.clone()));
count++;
groupCount++;
}
if (groupCount > 0) {
// write remaining "items" to file
}
}
private Element createClone(final Element source, final String[] nodeNames) {
Element destination = source;
while (destination.getParentElement() != null) {
destination = destination.getParentElement();
}
destination = (Element) destination.clone();
for (int tempCount = 1; tempCount < nodeNames.length - 1; ++tempCount) {
destination = destination.getChild(nodeNames[tempCount]);
}
return destination;
}
private Element importNodes(final Document document,
final String[] nodeNames) {
Element source = document.getRootElement();
if (!source.getName().equals(nodeNames[0])) {
return null;
}
Element destination = null;
for (int count = 0; count < (nodeNames.length - 1); count++) {
if (count > 0) {
source = source.getChild(nodeNames[count]);
}
final Element child = new Element(nodeNames[count]);
if (destination != null) {
destination.setContent(child);
}
destination = child;
// copy attributes -- don't want to clone here since this is one of the ancestors of "item"
for (Object objAttb : source.getAttributes()) {
Attribute attb = (Attribute) objAttb;
destination.setAttribute(attb.getName(), attb.getValue());
}
// this is for the <metadata> and <someOtherInfo> elements
for (Object obj : source.getChildren()) {
final Element childToClone = (Element) obj;
if (!childToClone.getName().equals(nodeNames[count + 1])
&& (ignoreWhiteSpaceNodes ? !childToClone.getName()
.equals("#text") : true)) {
final Element clone = (Element) childToClone.clone();
destination.addContent(clone);
}
}
}
return destination;
}
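Answer 0 (score: 10)
You just need to detach the element from its parent before inserting it into another document. In the question's code, destination is the "collection" element, which is still a child of the cloned "root", so new Document(destination) fails; the parentless root located by the loop just before that line is the element that can serve as a document root.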
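Answer 1 (score: 9)
In the JDOM implementation, every element is linked to its parent: before adding an element to a new destination, you must detach it from its original structure. Note that clone() already returns a copy with no parent, so calling detach() on the clone is a harmless no-op; detach() matters when you move the original element rather than a copy.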
Element elemCopy = (Element)element.clone();
elemCopy.detach();
destination.addContent(elemCopy);
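The single-parent rule described above can be illustrated without JDOM on the classpath. The following is only a minimal sketch: Node and ParentLinkSketch are hypothetical stand-ins written for this example, not JDOM classes, and the model throws IllegalStateException where JDOM would throw IllegalAddException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a JDOM-style element; this is NOT JDOM's real class.
class Node {
    final String name;
    private Node parent;
    private final List<Node> children = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }

    Node getParent() {
        return parent;
    }

    // Returns a copy of the child list, so callers cannot modify the tree through it.
    List<Node> getChildren() {
        return new ArrayList<>(children);
    }

    // Models the rule behind IllegalAddException: a node may have only one parent.
    Node addContent(Node child) {
        if (child.parent != null) {
            throw new IllegalStateException(
                "The Content already has an existing parent \"" + child.parent.name + "\"");
        }
        child.parent = this;
        children.add(child);
        return this;
    }

    // Unlinks this node from its parent and returns it, so calls can be chained.
    Node detach() {
        if (parent != null) {
            parent.children.remove(this);
            parent = null;
        }
        return this;
    }
}

class ParentLinkSketch {
    // Moves child under newParent, detaching it from wherever it currently lives.
    static void move(Node child, Node newParent) {
        newParent.addContent(child.detach());
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node collection = new Node("collection");
        root.addContent(collection);

        Node otherRoot = new Node("otherRoot");
        try {
            otherRoot.addContent(collection); // still attached to "root", so this is rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }

        move(collection, otherRoot); // detach first, then add
        System.out.println("new parent: " + collection.getParent().name);
    }
}
```

In this model, as in the answer above, the add succeeds only once the node has no parent; chaining detach() into addContent() keeps the two steps together.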
Answer 2 (score: 2)
To replace the content of one element with the content of another in JDOM:
element.removeContent();
int size = frEl.getContentSize();
for(int count = 0; count < size; count++) {
element.addContent(frEl.getContent(0).detach());
}
Answer 3 (score: 1)
If you have a list of elements, you may need to do something like the following.
for (int count = 0; count < resultEle.size(); count++) {
destDocument.getRootElement().getChild("result").addContent(resultEle.get(count).detach());
}
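One caveat about the loop above: it is safe when resultEle is an independent list, but lists returned by JDOM's getChildren() are live, so detaching elements while indexing into such a list shrinks it and skips entries. The sketch below models that behaviour with hypothetical classes (LiveNode and LiveListSketch are illustration-only names, not JDOM types) and shows iterating over a snapshot instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of an element whose child list is a live view; NOT JDOM's real class.
class LiveNode {
    final String name;
    private LiveNode parent;
    private final List<LiveNode> children = new ArrayList<>();

    LiveNode(String name) {
        this.name = name;
    }

    // Live, read-only view: it shrinks as children are detached.
    List<LiveNode> getChildren() {
        return Collections.unmodifiableList(children);
    }

    LiveNode addContent(LiveNode child) {
        if (child.parent != null) {
            throw new IllegalStateException("already has parent \"" + child.parent.name + "\"");
        }
        child.parent = this;
        children.add(child);
        return this;
    }

    LiveNode detach() {
        if (parent != null) {
            parent.children.remove(this);
            parent = null;
        }
        return this;
    }
}

class LiveListSketch {
    // Index-based loop over the live view, for comparison: it skips every other child.
    static int moveAllNaive(LiveNode source, LiveNode target) {
        List<LiveNode> live = source.getChildren();
        int moved = 0;
        for (int i = 0; i < live.size(); i++) {
            target.addContent(live.get(i).detach());
            moved++;
        }
        return moved;
    }

    // Copies the live view first, so detaching does not disturb the iteration.
    static int moveAll(LiveNode source, LiveNode target) {
        List<LiveNode> snapshot = new ArrayList<>(source.getChildren());
        for (LiveNode child : snapshot) {
            target.addContent(child.detach());
        }
        return snapshot.size();
    }

    // Builds a parent with the given number of children named item0, item1, ...
    static LiveNode withChildren(String name, int count) {
        LiveNode node = new LiveNode(name);
        for (int i = 0; i < count; i++) {
            node.addContent(new LiveNode("item" + i));
        }
        return node;
    }

    public static void main(String[] args) {
        LiveNode naiveSource = withChildren("collection", 4);
        int naiveMoved = moveAllNaive(naiveSource, new LiveNode("result"));
        System.out.println("naive moved " + naiveMoved
            + ", left behind " + naiveSource.getChildren().size());

        LiveNode source = withChildren("collection", 4);
        int moved = moveAll(source, new LiveNode("result"));
        System.out.println("snapshot moved " + moved
            + ", left behind " + source.getChildren().size());
    }
}
```

Answer 2 avoids the same problem differently: it reads the size once and always takes index 0, which also works on a live list.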