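Parsing an iTunes XML library with XPath in Java

Time: 2015-04-11 17:53:22

Tags: java xml parsing xpath itunes

I'm trying to write a method that takes a track ID and returns the name of the track with that ID.

I need to use XPath to parse the XML document in Java, and will later serialize the new library. A sample of my XML document follows: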

<plist version="1.0">
    <dict>
        <key>Major Version</key>
        <integer>1</integer>
        <key>Minor Version</key>
        <integer>1</integer>
        <key>Date</key>
        <date>2015-03-16T15:04:23Z</date>
        <key>Application Version</key>
        <string>12.1.0.71</string>
        <key>Features</key>
        <integer>5</integer>
        <key>Show Content Ratings</key>
        <true/>
        <key>Music Folder</key>
        <string>
        file://localhost/C:/Users/Mark/Music/iTunes/iTunes%20Media/
        </string>
        <key>Library Persistent ID</key>
        <string>3B01AE08EA513C21</string>
        <key>Tracks</key>
        <dict>
            <key>646</key>
            <dict>
            <key>Track ID</key>
            <integer>646</integer>
            <key>Name</key>
            <string>Save Me</string>
            <key>Artist</key>
            <string>Avenged Sevenfold</string>
            <key>Album Artist</key>
            <string>Avenged Sevenfold</string>
            <key>Album</key>
            <string>Nightmare</string>
            <key>Genre</key>
            <string>Metal</string>
            <key>Kind</key>
            <string>MPEG audio file</string>
            <key>Size</key>
            <integer>23257166</integer>
            <key>Total Time</key>
            <integer>656535</integer>
            <key>Disc Number</key>
            <integer>1</integer>
            <key>Disc Count</key>
            <integer>1</integer>
            <key>Track Number</key>
            <integer>11</integer>
            <key>Track Count</key>
            <integer>11</integer>
            <key>Year</key>
            <integer>2010</integer>
            <key>Date Modified</key>
            <date>2012-10-21T22:07:20Z</date>
            <key>Date Added</key>
            <date>2012-10-21T22:07:20Z</date>
            <key>Bit Rate</key>
            <integer>276</integer>
            <key>Sample Rate</key>
            <integer>44100</integer>
            <key>Play Count</key>
            <integer>2</integer>
            <key>Play Date</key>
            <integer>3415934327</integer>
            <key>Play Date UTC</key>
            <date>2012-03-30T06:38:47Z</date>
            <key>Artwork Count</key>
            <integer>1</integer>
            <key>Persistent ID</key>
            <string>0000000000001389</string>
            <key>Track Type</key>
            <string>File</string>
            <key>Location</key>
            <string>
            file://localhost/C:/Users/Mark/Music/Avenged%20Sevenfold/Nightmare/11%20-%20Save%20Me.mp3
            </string>
            <key>File Folder Count</key>
            <integer>2</integer>
            <key>Library Folder Count</key>
            <integer>1</integer>
        </dict>
        <key>648</key>
        <dict>
            <key>Track ID</key>
            <integer>648</integer>
            <key>Name</key>
            <string>Welcome 2 Hell</string>
            <key>Artist</key>
            <string>Bad Meets Evil</string>
            <key>Album Artist</key>
            <string>Bad Meets Evil</string>
            <key>Composer</key>
            <string>Havoc, Magnedo7</string>
            <key>Album</key>
            <string>Hell: The Sequel (Deluxe Edition)</string>
            <key>Genre</key>
            <string>Rap</string>
            <key>Kind</key>
            <string>MPEG audio file</string>
            <key>Size</key>
            <integer>7467977</integer>
            <key>Total Time</key>
            <integer>177606</integer>
            <key>Track Number</key>
            <integer>1</integer>
            <key>Year</key>
            <integer>2011</integer>
            <key>Date Modified</key>
            <date>2012-10-21T22:07:20Z</date>
            <key>Date Added</key>
            <date>2012-10-21T22:07:20Z</date>
            <key>Bit Rate</key>
            <integer>320</integer>
            <key>Sample Rate</key>
            <integer>44100</integer>
            <key>Play Count</key>
            <integer>3</integer>
            <key>Play Date</key>
            <integer>3424485861</integer>
            <key>Play Date UTC</key>
            <date>2012-07-07T06:04:21Z</date>
            <key>Skip Count</key>
            <integer>2</integer>
            <key>Skip Date</key>
            <date>2012-11-26T14:02:44Z</date>
            <key>Artwork Count</key>
            <integer>1</integer>
            <key>Persistent ID</key>
            <string>000000000000138A</string>
            <key>Track Type</key>
            <string>File</string>
            <key>Location</key>
            <string>
            file://localhost/C:/Users/Mark/Music/Bad%20Meets%20Evil/Hell_%20The%20Sequel%20(Deluxe%20Edition)/01%20-%20Welcome%202%20Hell.mp3
            </string>
            <key>File Folder Count</key>
            <integer>2</integer>
            <key>Library Folder Count</key>
            <integer>1</integer>
        </dict>
        </dict>
    </dict>
</plist>

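I'm new to XPath and to XML in general, and I'm struggling to navigate the iTunes XML file because of its complexity and sheer size.

My idea so far is to navigate to <key>646</key> to check the ID, and then navigate to the track name, <string>Save Me</string>, with "//dict/key[.="646"]/string[1]/text()".

This yields NULL. The Java code I have written so far is: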

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;


public class XMLparse {

    public XMLparse(){
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder;
        Document doc = null;
        try {
            builder = factory.newDocumentBuilder();
            doc = builder.parse(new File("C:\\musicLibrary.xml"));

            // Create XPathFactory object
            XPathFactory xpathFactory = XPathFactory.newInstance();

            // Create XPath object
            XPath xpath = xpathFactory.newXPath();

            int id = 646;

            String name = getTrackNameById(doc, xpath, id);
            System.out.println("Track Name with ID " + id + ": " + name);

        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }

    }

    public static String getTrackNameById(Document doc, XPath xpath, int id) {
        String name = null;
        try {
            XPathExpression expr = xpath.compile("//dict/integer[.="+id+"]/string[1]/text()");
            name = (String) expr.evaluate(doc, XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }

        return name;
    }

}

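Any help is much appreciated.

Edit:

Using Mathias Müller's suggestion produced the correct result, "Save Me", for track 646, as expected. However, when I enter a different track ID, it also returns "Save Me", which is incorrect.

I don't know why it does this: I expected it to return only the name of the track whose ID I entered, yet it returns the name of a different track.

Second edit:

- Included more XML

Third edit:

Following Mathias's suggestion, I changed the XPath expression to //dict[integer ="+id+"]/string[1]/text(). This works perfectly.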

1 answer:

Answer 0: (score: 1)

I can't comment on the Java code, but I can explain the XPath expression you should use. Given the input sample you have shown, applying

//dict[key='646']/dict/key[. = 'Name']/following-sibling::*[1]

returns

<string>Save Me</string>

which is the element you are looking for. To select only its text content, use

//dict[key='646']/dict/key[. = 'Name']/following-sibling::*[1]/text()

and the result is

Save Me
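
For readers who want to try the two expressions above from Java, here is a minimal sketch using the standard javax.xml.xpath API. It assumes a trimmed, single-track stand-in for the library (the SAMPLE string is made up for illustration and keeps only a few keys), and the class and method names are hypothetical. Requesting XPathConstants.NODE gives the selected element, while appending /text() and requesting XPathConstants.STRING gives its content.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class AnswerExpressionDemo {

    // Assumption: a reduced, single-track stand-in for the real library file.
    static final String SAMPLE =
          "<plist version=\"1.0\"><dict>"
        + "<key>Tracks</key><dict>"
        + "<key>646</key><dict>"
        + "<key>Track ID</key><integer>646</integer>"
        + "<key>Name</key><string>Save Me</string>"
        + "<key>Artist</key><string>Avenged Sevenfold</string>"
        + "</dict>"
        + "</dict>"
        + "</dict></plist>";

    // The element-selecting expression quoted in the answer.
    static final String ELEMENT_EXPR =
        "//dict[key='646']/dict/key[. = 'Name']/following-sibling::*[1]";

    // Parse an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Tag name of the element the expression selects, or null if nothing matches.
    static String selectedElementName(Document doc) {
        try {
            Node node = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(ELEMENT_EXPR, doc, XPathConstants.NODE);
            return node == null ? null : node.getNodeName();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath expression", e);
        }
    }

    // Text content selected by the expression with /text() appended.
    static String selectedText(Document doc) {
        try {
            return (String) XPathFactory.newInstance().newXPath()
                    .evaluate(ELEMENT_EXPR + "/text()", doc, XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath expression", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println(selectedElementName(doc) + " -> " + selectedText(doc));
    }
}
```

With XPathConstants.STRING, an expression that matches nothing evaluates to an empty string rather than null, so an empty result is worth checking for.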

The path expression works as follows:

//dict                    select `dict` elements anywhere in the document
[key='646']               but only if they have an immediate child `key` whose text
                          content is equal to "646"
/dict                     select their child elements called `dict`
/key[. = 'Name']          of those `dict` elements select their child elements `key`,
                          but only if their text content is equal to "Name"
/following-sibling::*[1]  of those `key` elements, select the first following sibling
                          element
/text()                   and select its text content
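
The step-by-step reading above can be checked in Java by counting the nodes each prefix of the expression selects. The sketch below does that on a made-up two-track sample (SAMPLE, the class name, and the pinnedExpr helper are all assumptions for illustration). On such a sample, [key='648'] matches the enclosing Tracks dict, so the following /dict step selects every track's dict, not only the one after <key>648</key>; converting that node-set to a string yields the first node in document order. This may be what the question's first edit ran into. The hypothetical pinnedExpr variant starts from the key itself and takes only the dict that immediately follows it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class XPathStepCounts {

    // Assumption: a reduced two-track stand-in for the real library file.
    static final String SAMPLE =
          "<plist version=\"1.0\"><dict>"
        + "<key>Tracks</key><dict>"
        + "<key>646</key><dict>"
        + "<key>Track ID</key><integer>646</integer>"
        + "<key>Name</key><string>Save Me</string>"
        + "</dict>"
        + "<key>648</key><dict>"
        + "<key>Track ID</key><integer>648</integer>"
        + "<key>Name</key><string>Welcome 2 Hell</string>"
        + "</dict>"
        + "</dict>"
        + "</dict></plist>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static Object eval(Document doc, String expr, QName type) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc, type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath expression: " + expr, e);
        }
    }

    // Number of nodes the expression selects.
    static int count(Document doc, String expr) {
        return ((NodeList) eval(doc, expr, XPathConstants.NODESET)).getLength();
    }

    // String value of the first selected node, or "" if nothing matches.
    static String text(Document doc, String expr) {
        return (String) eval(doc, expr, XPathConstants.STRING);
    }

    // Hypothetical variant: find the <key> holding the ID, then take only
    // the <dict> that directly follows it before looking up "Name".
    static String pinnedExpr(int id) {
        return "//dict/key[. = '" + id + "']/following-sibling::dict[1]"
             + "/key[. = 'Name']/following-sibling::*[1]/text()";
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        String[] steps = {
            "//dict",
            "//dict[key='648']",
            "//dict[key='648']/dict",
            "//dict[key='648']/dict/key[. = 'Name']",
            "//dict[key='648']/dict/key[. = 'Name']/following-sibling::*[1]"
        };
        for (String step : steps) {
            System.out.println(count(doc, step) + " node(s): " + step);
        }
        System.out.println("Pinned lookup for 648: " + text(doc, pinnedExpr(648)));
    }
}
```

Printing the counts makes it easy to see at which step an expression widens or narrows the selection, which is often quicker than reasoning about the whole path at once.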

The original expression, which relies on the position of the string element, can also be made to work with a minor change:

//dict[key ="646"]/dict/string[1]/text()
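
As a rough comparison of position-based lookups, the sketch below evaluates the form from the question's third edit, //dict[integer = id]/string[1]/text(), on a made-up two-track sample. The SAMPLE string, the class name, and both method names are assumptions for illustration. One thing to keep in mind is that the predicate integer = id is true for a dict with any integer child equal to id, not only the one after the Track ID key. The byTrackId variant is a hypothetical, stricter predicate, not something taken from the answer. Both forms still assume Name is the first string child of the track's dict.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class PositionBasedLookup {

    // Assumption: a reduced two-track stand-in for the real library file.
    static final String SAMPLE =
          "<plist version=\"1.0\"><dict>"
        + "<key>Tracks</key><dict>"
        + "<key>646</key><dict>"
        + "<key>Track ID</key><integer>646</integer>"
        + "<key>Name</key><string>Save Me</string>"
        + "<key>Artist</key><string>Avenged Sevenfold</string>"
        + "<key>Track Number</key><integer>11</integer>"
        + "</dict>"
        + "<key>648</key><dict>"
        + "<key>Track ID</key><integer>648</integer>"
        + "<key>Name</key><string>Welcome 2 Hell</string>"
        + "<key>Artist</key><string>Bad Meets Evil</string>"
        + "<key>Track Number</key><integer>1</integer>"
        + "</dict>"
        + "</dict>"
        + "</dict></plist>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static String text(Document doc, String expr) {
        try {
            return (String) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.STRING);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath expression: " + expr, e);
        }
    }

    // Form from the question's third edit: matches a dict with ANY
    // integer child equal to id, then takes its first string child.
    static String byAnyInteger(Document doc, int id) {
        return text(doc, "//dict[integer = " + id + "]/string[1]/text()");
    }

    // Hypothetical stricter form: only the integer that follows the
    // "Track ID" key is compared with id.
    static String byTrackId(Document doc, int id) {
        return text(doc, "//dict[key[. = 'Track ID']/following-sibling::integer[1] = "
                + id + "]/string[1]/text()");
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        for (int id : new int[] {646, 648, 11}) {
            System.out.println(id + ": any-integer='" + byAnyInteger(doc, id)
                    + "' track-id='" + byTrackId(doc, id) + "'");
        }
    }
}
```

On this sample both forms agree for 646 and 648. For 11, which is a Track Number and not a track ID, the looser form still returns a name while the stricter one returns an empty string.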