Parsing complex XML with CDATA in C# in a Windows Phone 8 app

Time: 2013-11-24 00:31:53

Tags: c# xml windows-phone-8 linq-to-xml cdata

I'm trying to read some XML data in my Windows Phone 8 app, where the text content is stored in CDATA sections. Here is a sample of the data:

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE HolyQuran [
<!ATTLIST HolyQuran TranslationID CDATA #REQUIRED>
<!ATTLIST HolyQuran Writer CDATA #REQUIRED>
<!ATTLIST HolyQuran Language CDATA #REQUIRED>
<!ATTLIST HolyQuran LanguageIsoCode CDATA #REQUIRED>
<!ATTLIST HolyQuran Direction (rtl|ltr) #REQUIRED>
<!ELEMENT HolyQuran (Chapter+)>
<!ATTLIST Chapter ChapterID CDATA #REQUIRED>
<!ATTLIST Chapter ChapterName CDATA #REQUIRED>
<!ELEMENT Chapter (Verse+)>
<!ATTLIST Verse VerseID CDATA #REQUIRED>
<!ELEMENT Verse (#PCDATA)>
  ]>
<!-- This SQL Query Generated at 22 November 2013 01:44 (UTC) from
  www.qurandatabase.org -->
<HolyQuran TranslationID="59" Writer="Yusuf Ali" Language="English"
    LanguageIsoCode="eng" Direction="ltr">
<Chapter ChapterID="1" ChapterName="The Opening">
    <Verse VerseID="1"><![CDATA[In the name of Allah, Most Gracious, Most
                          Merciful.]]></Verse>
    <Verse VerseID="2"><![CDATA[Praise be to Allah, the Cherisher and Sustainer
                          of the worlds;]]></Verse>
    <Verse VerseID="3"><![CDATA[Most Gracious, Most Merciful;]]></Verse>
    <Verse VerseID="4"><![CDATA[Master of the Day of Judgment.]]></Verse>
    <Verse VerseID="5"><![CDATA[Thee do we worship, and Thine aid we seek.
                         ]]></Verse>
    <Verse VerseID="6"><![CDATA[Show us the straight way,]]></Verse>
    <Verse VerseID="7"><![CDATA[The way of those on whom Thou hast bestowed Thy
                         Grace, those whose (portion) is not wrath, and who go
                         not astray.]]></Verse>
</Chapter>
<Chapter ChapterID="114" ChapterName="The Men">
<Verse VerseID="1"><![CDATA[Say: I seek refuge with the Lord and Cherisher 
             of Mankind,]]></Verse>
<Verse VerseID="2"><![CDATA[The King (or Ruler) of Mankind,]]></Verse>
<Verse VerseID="3"><![CDATA[The god (or judge) of Mankind,-]]></Verse>
<Verse VerseID="4"><![CDATA[From the mischief of the Whisperer (of Evil), who 
                         withdraws (after his whisper),-]]></Verse>
<Verse VerseID="5"><![CDATA[(The same) who whispers into the hearts of Mankind,-]]>  
    </Verse>
<Verse VerseID="6"><![CDATA[Among Jinns and among men.]]></Verse>
</Chapter>
</HolyQuran>

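I want to end up with a data structure that holds the whole book, with one child structure per chapter containing the ChapterName, the ChapterID, and a list of all verse contents for that chapter together with their VerseIDs. Note that by verse content I mean the CDATA. I need to use XDocument, but I can't figure out how to parse this complex XML.

I would really appreciate any help!

Thanks!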

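1 Answer:

Answer 0: (score: 1)

The simplest approach is XML serialization: define classes that match the structure of the XML document, decorate them with attributes that describe the XML schema, and use the XmlSerializer class to parse the input.

In your case, the classes would look like this: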

using System.Collections.Generic;
using System.Xml;
using System.Xml.Serialization;

public class HolyQuran
{
    [XmlAttribute]
    public int TranslationID { get; set; }
    [XmlAttribute]
    public string Writer { get; set; }
    [XmlAttribute]
    public string Language { get; set; }
    // The XML attribute is named LanguageIsoCode, so map it explicitly.
    [XmlAttribute("LanguageIsoCode")]
    public string LangIsoCode { get; set; }
    [XmlAttribute]
    public string Direction { get; set; }
    [XmlElement("Chapter")]
    public List<Chapter> Chapters { get; set; }
}

public class Chapter
{
    [XmlAttribute]
    public int ChapterID { get; set; }
    [XmlAttribute]
    public string ChapterName { get; set; }
    [XmlElement("Verse")]
    public List<Verse> Verses { get; set; }
}

public class Verse
{
    // XML names are case-sensitive: the attribute is VerseID, not VerseId.
    [XmlAttribute("VerseID")]
    public int VerseId { get; set; }
    [XmlText]
    public string Text { get; set; }
}

You can use the following code to parse the file:

static HolyQuran LoadQuran(string path)
{
    var readerSettings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
    using (var reader = XmlReader.Create(path, readerSettings))
    {
        var xs = new XmlSerializer(typeof(HolyQuran));
        return (HolyQuran)xs.Deserialize(reader);
    }
}
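
On a phone the XML often arrives as a stream (for example from an app resource) rather than a file path. The following is a minimal sketch of the same approach reading from a Stream; the names XmlLoadSketch, Load and Memo are hypothetical and only serve to keep the example self-contained.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Tiny stand-in type, used only to exercise the helper below.
public class Memo
{
    [XmlAttribute]
    public int Id { get; set; }
    [XmlText]
    public string Body { get; set; }
}

public static class XmlLoadSketch
{
    // Deserializes any XmlSerializer-compatible type T from an open stream.
    public static T Load<T>(Stream stream)
    {
        // Skip an inline DTD instead of rejecting the document.
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
        using (var reader = XmlReader.Create(stream, settings))
        {
            var serializer = new XmlSerializer(typeof(T));
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

With the classes from this answer, the call would be XmlLoadSketch.Load<HolyQuran>(stream), where the stream comes from wherever the app stores the file.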

You don't have to do anything special to parse the CDATA sections; XmlReader already knows how to handle them.
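
Since the question asks specifically for XDocument, here is a rough LINQ to XML sketch of the same parse. It is not part of the original answer: the type and method names (ChapterInfo, VerseInfo, QuranXmlSketch, ParseChapters) are hypothetical, and collapsing the line breaks inside multi-line verses is an assumption about the desired output.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public class VerseInfo
{
    public int VerseID { get; set; }
    public string Text { get; set; }
}

public class ChapterInfo
{
    public int ChapterID { get; set; }
    public string ChapterName { get; set; }
    public List<VerseInfo> Verses { get; set; }
}

public static class QuranXmlSketch
{
    // Builds one ChapterInfo per <Chapter>, each with its list of verses.
    public static List<ChapterInfo> ParseChapters(TextReader input)
    {
        // Skip the inline DTD instead of rejecting the document.
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
        using (var reader = XmlReader.Create(input, settings))
        {
            var doc = XDocument.Load(reader);
            return doc.Root.Elements("Chapter")
                .Select(c => new ChapterInfo
                {
                    ChapterID = (int)c.Attribute("ChapterID"),
                    ChapterName = (string)c.Attribute("ChapterName"),
                    Verses = c.Elements("Verse")
                        .Select(v => new VerseInfo
                        {
                            VerseID = (int)v.Attribute("VerseID"),
                            // XElement.Value returns CDATA content as plain text.
                            Text = NormalizeWhitespace(v.Value)
                        })
                        .ToList()
                })
                .ToList();
        }
    }

    // Collapses line breaks and indentation inside a verse into single spaces.
    static string NormalizeWhitespace(string s)
    {
        var parts = s.Split(new[] { ' ', '\t', '\r', '\n' },
                            StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", parts);
    }
}
```

Here, too, the CDATA needs no special handling: the Value property of each Verse element already contains the text inside the CDATA section.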