Writing XML with JSoup

Date: 2015-03-04 21:12:00

Tags: java xml jsoup

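I parsed an XML file with JSoup, and now I want to write the (modified) document to a new XML file.

The problem is that JSoup adds a bunch of HTML wrapper markup (html, head, and body elements) and turns the XML declaration into a comment.

The output should start like this: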

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE score-partwise PUBLIC "-//Recordare//DTD MusicXML 2.0 Partwise//EN" "http://www.musicxml.org/dtds/partwise.dtd">
<score-partwise>
  <identification>
    <encoding>

But it actually starts like this:

<!--?xml version="1.0" encoding="UTF-8"?--><!DOCTYPE score-partwise PUBLIC "-//Recordare//DTD MusicXML 2.0 Partwise//EN" "http://www.musicxml.org/dtds/partwise.dtd">
<html>
 <head></head>
 <body>
  <score-partwise> 
   <identification> 
    <encoding> 
     <software>
      MuseScore 1.3
     </software> 
     <encoding-date>
      2015-01-31
     </encoding-date> 
    </encoding> 
    <source>http://musescore.com/score/161981 
   </identification> 
   <defaults> 
    <scaling> 
     <millimeters>
      7.056
     </millimeters> 
     <tenths>
      40
     </tenths> 
    </scaling> 
    <page-layout> 
     <page-height>
      1683.67
     </page-height> 
     <page-width>
      1190.48
     </page-width> 

I load the file like this:

if (doc.getElementsByTag("note").isEmpty()) {
    doc = Jsoup.parse(input, "UTF-16", filename);
    if (doc.getElementsByTag("note").isEmpty()) {
        System.out.println("Please check that your file is encoded in UTF-8 or UTF-16 and contains notes.");
    }
}

And I tried writing it like this:

BufferedWriter htmlWriter = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("output.xml"), "UTF-8"));
htmlWriter.write(doc.outerHtml());

I also tried doc.html() and doc.toString(). The output is still the same.

Any ideas? I just want the document written out the same way it was read in.
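A side note on the writing step: the snippet above does not show the writer being closed, and a BufferedWriter that is never flushed or closed can leave the file truncated or empty. A minimal sketch of a safer write, using only the standard library, might look like this. The class name XmlFileWriter and the method writeUtf8 are hypothetical names chosen for illustration; they are not part of JSoup or of the original code.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the original question.
public class XmlFileWriter {

    // Writes the given text to the target path as UTF-8.
    // try-with-resources closes (and therefore flushes) the writer,
    // even if write() throws.
    public static void writeUtf8(Path target, String content) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(content);
        }
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for "output.xml" so the sketch has no side effects.
        Path target = Files.createTempFile("output", ".xml");
        writeUtf8(target, "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<score-partwise/>\n");
        // Read the file back to confirm the content reached the disk.
        System.out.print(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
    }
}
```

In the question's code, the equivalent call would pass doc.outerHtml() as the content. This does not change what JSoup serializes; it only makes sure everything JSoup produced ends up in the file.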

1 answer:

Answer 0 (score: 1)

This solved it:

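// Parse with JSoup's XML parser so the document is not wrapped in <html>, <head>, and <body>.
try (InputStream is = new FileInputStream(filename)) {
    doc = Jsoup.parse(is, "UTF-8", "", Parser.xmlParser());
}

// If no <note> elements were found, retry as UTF-16. The first stream has
// already been consumed, so a new one is opened for the second attempt.
if (doc.getElementsByTag("note").isEmpty()) {
    try (InputStream is = new FileInputStream(filename)) {
        doc = Jsoup.parse(is, "UTF-16", "", Parser.xmlParser());
    }
    if (doc.getElementsByTag("note").isEmpty()) {
        System.out.println("Please check that your file is encoded in UTF-8 or UTF-16 and contains notes.");
    }
}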
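
To see why Parser.xmlParser() makes the difference, here is a small sketch that parses the same string twice, once with the XML parser and once with JSoup's default HTML parser, and checks whether an <html> wrapper shows up in the serialized output. It assumes JSoup is on the classpath; the class name XmlParserDemo, its method names, and the sample input are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

// Hypothetical demo class, not part of the original answer.
public class XmlParserDemo {

    // Parses the input with JSoup's XML parser and serializes it again.
    public static String roundTrip(String xml) {
        Document doc = Jsoup.parse(xml, "", Parser.xmlParser());
        return doc.outerHtml();
    }

    // For comparison: the default HTML parser, which normalizes the input
    // into an HTML document with <html>, <head>, and <body> elements.
    public static String roundTripAsHtml(String xml) {
        return Jsoup.parse(xml).outerHtml();
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<score-partwise><part><note/></part></score-partwise>";

        // XML parser: no <html> wrapper is added.
        System.out.println(roundTrip(xml).contains("<html>"));       // prints false
        // HTML parser: the wrapper from the question appears.
        System.out.println(roundTripAsHtml(xml).contains("<html>")); // prints true
    }
}
```

The overloads of Jsoup.parse that take no Parser argument use the HTML parser, which is why the loading code in the question produced the wrapped output regardless of whether outerHtml(), html(), or toString() was used to write it.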