I have to read a large XML file (A.xml) and create a new XML file (B.xml) with the same content as A.xml, except that some attribute values are updated in B.xml.
For example, if A.xml is:
<?xml version="1.0" encoding="utf-8"?>
<one>
<!-- comment -->
<a att="hello" />
</one>
<two />
I want B.xml to contain:
<?xml version="1.0" encoding="utf-8"?>
<one>
<!-- comment -->
<a att="UPDATED" />
</one>
<two />
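I have been looking at using SAX for parsing and a PrintWriter for writing, but that looks very low-level, and I don't know whether it can copy comments and preserve this kind of self-closing tag: />.
I would prefer a streaming parser over loading the whole document into memory, but I am open to suggestions.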
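Answer 0 (score: 2)
For a streaming solution, you can use javax.xml.stream.XMLStreamReader or XMLEventReader to read the XML document, update whatever parts you want to change, and pipe the data/events from the reader into a corresponding writer such as javax.xml.stream.XMLStreamWriter.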
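A minimal sketch of that reader-to-writer pipeline, using the event-based StAX API that ships with the JDK. The class name StaxAttributeUpdate and the method names update and withAttribute are made up for illustration, and the sketch works on strings only to stay self-contained. It copies every event unchanged (comments included) and swaps in a rebuilt start tag when the element name matches.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: streams XML from a reader to a writer, replacing one attribute value.
final class StaxAttributeUpdate {

    static String update(String xml, String elementName, String attrName, String newValue) {
        XMLEventFactory events = XMLEventFactory.newInstance();
        StringWriter out = new StringWriter();
        try {
            XMLEventReader reader =
                    XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
            XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                // Only matching start tags are rebuilt; every other event is copied as-is.
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals(elementName)) {
                    event = withAttribute(events, event.asStartElement(), attrName, newValue);
                }
                writer.add(event);
            }
            writer.close();
            reader.close();
        } catch (XMLStreamException ex) {
            throw new IllegalStateException("Could not rewrite XML", ex);
        }
        return out.toString();
    }

    // Builds a copy of the start tag in which attrName (if present) carries newValue.
    private static StartElement withAttribute(XMLEventFactory events, StartElement start,
                                              String attrName, String newValue) {
        List<Attribute> attributes = new ArrayList<>();
        for (Iterator<?> it = start.getAttributes(); it.hasNext();) {
            Attribute attribute = (Attribute) it.next();
            if (attribute.getName().getLocalPart().equals(attrName)) {
                attribute = events.createAttribute(attribute.getName(), newValue);
            }
            attributes.add(attribute);
        }
        return events.createStartElement(start.getName(), attributes.iterator(),
                start.getNamespaces());
    }
}
```

For real files, the same loop should work with createXMLEventReader(new FileInputStream("A.xml")) and createXMLEventWriter(new FileOutputStream("B.xml"), "UTF-8"), so only the current event is held in memory. Comments are passed through as events, but the writer decides how empty elements are serialized, so a self-closing tag may come out as a start/end pair; the result is equivalent XML, not necessarily a byte-for-byte copy.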
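Answer 1 (score: 2)
I don't see why you would rather not keep the XML document in memory, unless the XML files you are working with are large (100+ MB).
I can think of two ways to solve this:
1. Read the file character by character and change what needs changing. This meets your requirement, but it is slow and hard to implement.
2. Use an XML parser, find the elements you are looking for, and change them. This is the one I lean towards.
The first way is to read the XML file character by character, look for the tags you want, change them, and write the XML to the second file as you go. This is very streamlined; however, XML can have tags nested within tags, which gets complicated very quickly. You could use a parser to achieve this, but that would probably involve keeping the document in memory.
The second way is simple. Parse the file with an XML parser, iterate over the elements, change them, and finally write the edited XML back to a file. This involves keeping the document in memory, but unless you are on a memory-constrained machine or the document is large (100+ MB), that is not really a problem.
I'm not going to write a complete program here, nor give an example of the first way (it is overly complicated anyway), but I will give you a starting point for the second.
What you will need:
Java 8 update 65
Required library: Dom4J, used as the XML parser.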
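import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Scanner;
import org.dom4j.Attribute;
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Main {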
private static final Scanner SCANNER = new Scanner(System.in);
/**
* The file we're reading from.
*/
private File inputFile;
/**
* The file we're writing to.
*/
private File outputFile;
/**
* The attributes to replace.
*/
private List<UserAttribute> attributes = new ArrayList<>();
private Main() {
getFiles();
getReplacementTags();
}
private void getFiles() {
System.out.println("Please enter the input file...");
String input = SCANNER.nextLine();
File inFile = new File(input);
if (!inFile.exists() || !inFile.isFile()) {
System.err.println("The file you entered doesn't exist or isn't a file!");
System.exit(1);
}
inputFile = inFile;
System.out.println("Please enter the output file...");
String output = SCANNER.nextLine();
File outFile = new File(output);
if (!outFile.exists()) {
try {
outFile.createNewFile();
System.out.println("Created file: " + outFile);
} catch (IOException ex) {
System.err.println("Couldn't create the output file!");
System.exit(2);
}
}
outputFile = outFile;
}
private void getReplacementTags() {
System.out.println("Enter the tags you wish to replace");
System.out.println("The format is &element name &attribute &replacement. (e.g. &one &a att &UPDATED!)");
System.out.println("Enter a list of tags you wish to replace with each in a new line. Enter # when finished.");
while (true) {//I'm using an infinite loop because it just seems easier to implement.
String line = SCANNER.nextLine();
if (line.equals("#")) {
break;
}
try {
UserAttribute attribute = getAttributeFromUserText(line);
this.attributes.add(attribute);
System.out.println("Added attribute replacement: " + attribute);
} catch (IllegalArgumentException ex) {
System.err.println("Incorrect attribute format: \n\t" + ex.getMessage());
}
}
startReplacing();
}
private void startReplacing() {
@SuppressWarnings("UnusedAssignment")
Document doc = null;
try {
doc = new SAXReader().read(inputFile);
} catch (DocumentException ex) {
System.err.println("Couldn't read xml file: " + ex.getMessage());
System.exit(3);
}
replaceAttributes(doc);
try (FileWriter writer = new FileWriter(outputFile)) {
doc.write(writer);
System.out.println("Saved xml document to file: " + outputFile);
} catch (IOException ex) {
System.err.println("Couldn't write to file: " + ex.getMessage());
}
}
/**
* This does all the magic.
*
* You might want to fix this up as I'm sure it's rather slow. This only
* scans 1 tag deep.
*/
private void replaceAttributes(Document doc) {
for (UserAttribute uattribute : attributes) {
Element root = doc.getRootElement();
for (Iterator i = root.elementIterator(); i.hasNext();) {
Element element = (Element) i.next();
if (element.getName().equals(uattribute.element)) {
for (Iterator i1 = element.attributeIterator(); i1.hasNext();) {
Attribute attribute = (Attribute) i1.next();
if(attribute.getName().equals(uattribute.attribute)){
attribute.setValue(uattribute.replacement);
}
}
}
}
}
}
public static void main(String[] args) {
Main m = new Main();
}
private static UserAttribute getAttributeFromUserText(String text) throws IllegalArgumentException {//This is a bit incomplete...
String[] split = text.split("&");
if (split.length != 4) {
throw new IllegalArgumentException("Incorrect number of arguments!");
}
return new UserAttribute(split[1].replace(" ", ""), split[2].replace(" ", ""), split[3]);
}
private static final class UserAttribute {
public final String element;
public final String attribute;
public final String replacement;
public UserAttribute(String element, String attribute, String replacement) {
this.element = element;
this.attribute = attribute;
this.replacement = replacement;
}
public String getElement() {
return element;
}
public String getAttribute() {
return attribute;
}
public String getReplacement() {
return replacement;
}
@Override
public String toString() {
return String.format("{element=%s, attribute=%s, replacement=%s}", element, attribute, replacement);
}
}
}
A.XML
<?xml version="1.0" encoding="UTF-8"?>
<root>
<PersonA name="Jenny" age="22">
<!-- A Random Comment -->
<friends number="3">
Friend A,
Friend B,
Friend C
</friends>
</PersonA>
<PersonB name="Bob" age="44">
<!-- A Random Comment... again -->
<friends number="5">
Friend A,
Friend B,
Friend C,
Friend D,
Friend E
</friends>
</PersonB>
</root>
B.XML
<?xml version="1.0" encoding="UTF-8"?>
<root>
<PersonA name="Joe" age="41">
<!-- A Random Comment -->
<friends number="3">
Friend A,
Friend B,
Friend C
</friends>
</PersonA>
<PersonB name="Ashley" age="32">
<!-- A Random Comment... again -->
<friends number="5">
Friend A,
Friend B,
Friend C,
Friend D,
Friend E
</friends>
</PersonB>
</root>
Console session (arguments entered)
run:
Please enter the input file...
A.xml
Please enter the output file...
B.xml
Enter the tags you wish to replace
The format is &element name &attribute &replacement. (e.g. &one &a att &UPDATED!)
Enter a list of tags you wish to replace with each in a new line. Enter # when finished.
&PersonA &name &Joe
Added attribute replacement: {element=PersonA, attribute=name, replacement=Joe}
&PersonA &age &41
Added attribute replacement: {element=PersonA, attribute=age, replacement=41}
&PersonB &name &Ashley
Added attribute replacement: {element=PersonB, attribute=name, replacement=Ashley}
&PersonB &age &32
Added attribute replacement: {element=PersonB, attribute=age, replacement=32}
#
Saved xml document to file: B.xml
BUILD SUCCESSFUL (total time: 1 minute 32 seconds)
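This does almost everything you asked for; the only problems are:
Problems aside, this should give you a head start... I hope.
P.S. Sorry for any typos, mistakes, or formatting errors. I wrote this in a short time and didn't test it much. If you find an error, please leave a comment.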
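The replaceAttributes method above only checks direct children of the root element. As a rough sketch of the same in-memory approach that reaches elements at any depth, the version below uses the JDK's built-in DOM API instead of Dom4J (swapped in here so the example needs no external library). The class name DomAttributeUpdate and its update method are hypothetical names for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper: loads the whole document, updates one attribute on every
// element with the given name (at any depth), and serializes the result.
final class DomAttributeUpdate {

    static String update(String xml, String elementName, String attrName, String newValue) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            // getElementsByTagName searches all descendants, not just the root's children.
            NodeList matches = doc.getElementsByTagName(elementName);
            for (int i = 0; i < matches.getLength(); i++) {
                Element element = (Element) matches.item(i);
                if (element.hasAttribute(attrName)) {
                    element.setAttribute(attrName, newValue);
                }
            }

            // An identity transform writes the modified tree back out as text.
            StringWriter out = new StringWriter();
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException
                | TransformerException ex) {
            throw new IllegalStateException("Could not update XML", ex);
        }
    }
}
```

Calling DomAttributeUpdate.update(xml, "PersonA", "name", "Joe") would cover the first replacement in the session above. Comments are kept because they are part of the DOM tree, but attribute order and whitespace in the output may differ from the input.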
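Answer 2 (score: 1)
The best XML parser for your update use case is, without question, VTD-XML... for the following two reasons:
Read this paper for more information: it is titled "Processing XML with Java - A Performance Benchmark".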