Parsing XML to create a DataSet

Date: 2015-03-27 01:34:06

Tags: c# xml winforms linq dataset

Given this XML, I am trying to get the photos that belong to each article:

    <?xml version="1.0" encoding="UTF-8"?>
    <articles>
        <article hint="0">      
            <id>498940</id>
            <type>1</type>
            <category>International</category>
            <title>
                <![CDATA[News title 1]]>
            </title>
            <description>
                <![CDATA[News Description 2]]>
            </description>  
            <content>   
                <photos is3idfp="CMS_12_498940" is3fechapub="2015-03-26 15:53:54">
                    <photo>                 
                        <photoURL>http://static01.nyt.com/images/2015/03/27/world/27IRAQ/27IRAQ-master675.jpg</photoURL>
                        <photodescription>
                            <![CDATA[U.S. Airstrikes on ISIS in Tikrit Prompt Boycott by Shiite Fighters]]>
                        </photodescription>
                    </photo>
                    <photo>
                        <photoURL>http://static01.nyt.com/images/2015/03/26/world/alps-web/26plane10-master675.jpg</photoURL>
                        <photodescription>
                            <![CDATA[Challenges Weigh Heavily on Recovery Efforts in Germanwings Crash]]>
                        </photodescription>
                    </photo>
                    <photo>
                        <photoURL>http://static01.nyt.com/images/2015/03/26/world/26Yemen3/26Yemen3-master180.jpg</photoURL>
                        <photodescription>
                            <![CDATA[Saudi Arabia Leads Air Assault in Yemen]]>
                        </photodescription>
                    </photo>            
                </photos>
            </content>      
        </article>
        <article hint="0">      
            <id>498941</id>
            <type>5</type>
            <title>
                <![CDATA[Advertisement]]>
            </title>
            <urlAd>http://ads.google.com/RealMedia/ads/adstream_nx.ads/(random)@x31</urlAd>
        </article>
        <article hint="0">      
            <id>498940</id>
            <type>1</type>
            <category>International</category>
            <title>
                <![CDATA[News title 2]]>
            </title>
            <description>
                    <![CDATA[News Description 2]]>
            </description>              
            <content>               
                <photos is3idfp="CMS_12_498940" is3fechapub="2015-03-26 15:53:54">
                    <photo>                 
                        <photoURL>http://static01.nyt.com/images/2015/03/27/sports/Y-JACKSON/Y-JACKSON-master675.jpg</photoURL>
                        <photodescription>
                            <![CDATA[Wisconsin Guard Carries an N.B.A. Pedigree, but Is Inspired by His Mother]]>
                        </photodescription>
                    </photo>
                    <photo>
                        <photoURL>http://static01.nyt.com/images/2015/03/27/sports/LOVE/LOVE-master675.jpg</photoURL>
                        <photodescription>
                            <![CDATA[Kevin Love Shows What He Can Do as Cavaliers See What They Can Be]]>
                        </photodescription>
                    </photo>
                    <photo>
                        <photoURL>http://static01.nyt.com/images/2015/03/26/sports/CITY-KNICKS/CITY-KNICKS-blog427.jpg</photoURL>
                        <photodescription>
                            <![CDATA[Knicks Approach a Franchise Record With a Pounding From the Clippers]]>
                        </photodescription>
                    </photo>            
                </photos>
            </content>      
        </article>
    </articles>

Loading the XML into a DataSet:

string sourceXML = "http://mydomain/myxmlfile.xml";
XmlReader xmlFile = XmlReader.Create(sourceXML, new XmlReaderSettings());
DataSet ds = new DataSet();
ds.ReadXml(xmlFile);

I get four tables:

ds.Tables[0].TableName, article
ds.Tables[1].TableName, content
ds.Tables[2].TableName, photos
ds.Tables[3].TableName, photo

So I tried to parse it. I get the articles, but each one is followed by all of the photos in the file:

1 title: News title 1           
1 title: News Description 2         
http://static01.nyt.com/images/2015/03/27/world/27IRAQ/27IRAQ-master675.jpg
http://static01.nyt.com/images/2015/03/26/world/alps-web/26plane10-master675.jpg
http://static01.nyt.com/images/2015/03/26/world/26Yemen3/26Yemen3-master180.jpg
http://static01.nyt.com/images/2015/03/27/sports/Y-JACKSON/Y-JACKSON-master675.jpg
http://static01.nyt.com/images/2015/03/27/sports/LOVE/LOVE-master675.jpg
http://static01.nyt.com/images/2015/03/26/sports/CITY-KNICKS/CITY-KNICKS-blog427.jpg
1 title: News title 2           
1 title: News Description 2         
http://static01.nyt.com/images/2015/03/27/world/27IRAQ/27IRAQ-master675.jpg
http://static01.nyt.com/images/2015/03/26/world/alps-web/26plane10-master675.jpg
http://static01.nyt.com/images/2015/03/26/world/26Yemen3/26Yemen3-master180.jpg
http://static01.nyt.com/images/2015/03/27/sports/Y-JACKSON/Y-JACKSON-master675.jpg
http://static01.nyt.com/images/2015/03/27/sports/LOVE/LOVE-master675.jpg
http://static01.nyt.com/images/2015/03/26/sports/CITY-KNICKS/CITY-KNICKS-blog427.jpg

I want to get only the photos that belong to each article. This is what I tried:

foreach (DataRow row in ds.Tables[0].Rows)
{
    try
    {
        string element = row["type"].ToString();
        article = new Article();
        feed += element + " title: " + row["title"] + Environment.NewLine;
        feed += element + " title: " + row["description"] + Environment.NewLine;
        article.setTitle("<![CDATA[" + row["title"].ToString() + "]]>");
        foreach (DataRow row1 in ds.Tables[3].Rows)
        {
            feed += row1["photoURL"] + Environment.NewLine;                                       
        }
        listArt.Add(article);

        i++;
    }
    catch (IndexOutOfRangeException ioe)
    {
        feed += "Error al crear " + sourceXML + Environment.NewLine;
        feed += ioe.ToString() + Environment.NewLine;
    }
}
textBox1.AppendText(feed);

1 answer:

Answer 0 (score: 0)

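// Requires: using System.Linq; using System.Xml.Linq;
// `path` is the file path or URL of the XML document.
var q = from article in XDocument.Load(path)
                                 .Root
                                 .Elements("article")
        // Elements(...) on a sequence yields nothing when an article (such as
        // the type 5 advertisement) has no <content>; chaining Element(...)
        // would throw a NullReferenceException there.
        let photos = article.Elements("content")
                            .Elements("photos")
                            .Elements("photo")
                            .Elements("photoURL")
        select new
        {
            ArticleId = (string)article.Element("id"),
            Photos = photos.Select(e => (string)e).ToArray()
        };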

Usage:

foreach (var a in q)
{
    Console.WriteLine(a.ArticleId);
    foreach (var p in a.Photos)
    {
        Console.WriteLine(p);
    }
}
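
If staying with the DataSet from the question is preferable, the following is a minimal sketch of that route. `ReadXml` links the inferred tables through parent/child relations, so looping over `ds.Tables[3]` inside the article loop prints every photo in the file, whereas `DataRow.GetChildRows` returns only the rows nested under a given parent. The trimmed sample XML, the example.com URLs, and the names `ArticlePhotos`, `Children`, `PhotoUrls` and `Load` are made up for illustration. The relations are looked up through `ChildRelations` so the sketch does not depend on the relation names that `ReadXml` generates.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;

static class ArticlePhotos
{
    // Hypothetical, trimmed-down sample in the same shape as the feed in the question.
    public const string SampleXml = @"<articles>
  <article hint=""0"">
    <id>1</id>
    <title>News title 1</title>
    <content>
      <photos is3idfp=""CMS_1"">
        <photo>
          <photoURL>http://example.com/a.jpg</photoURL>
          <photodescription>First photo</photodescription>
        </photo>
        <photo>
          <photoURL>http://example.com/b.jpg</photoURL>
          <photodescription>Second photo</photodescription>
        </photo>
      </photos>
    </content>
  </article>
  <article hint=""0"">
    <id>2</id>
    <title>Advertisement</title>
  </article>
  <article hint=""0"">
    <id>3</id>
    <title>News title 2</title>
    <content>
      <photos is3idfp=""CMS_3"">
        <photo>
          <photoURL>http://example.com/c.jpg</photoURL>
          <photodescription>Third photo</photodescription>
        </photo>
      </photos>
    </content>
  </article>
</articles>";

    // Loads the XML and lets ReadXml infer the tables and their relations.
    public static DataSet Load(string xml)
    {
        var ds = new DataSet();
        using (var reader = new StringReader(xml))
        {
            ds.ReadXml(reader);
        }
        return ds;
    }

    // Returns the rows of childTable that are nested under the given parent row,
    // following whichever inferred relation connects the two tables.
    static IEnumerable<DataRow> Children(DataRow parent, string childTable)
    {
        foreach (DataRelation relation in parent.Table.ChildRelations)
        {
            if (relation.ChildTable.TableName != childTable) continue;
            foreach (DataRow child in parent.GetChildRows(relation))
                yield return child;
        }
    }

    // Walks article -> content -> photos -> photo and collects the URLs.
    // An article without <content> simply produces an empty list.
    public static List<string> PhotoUrls(DataRow article)
    {
        var urls = new List<string>();
        foreach (DataRow content in Children(article, "content"))
            foreach (DataRow photos in Children(content, "photos"))
                foreach (DataRow photo in Children(photos, "photo"))
                    urls.Add(photo["photoURL"].ToString());
        return urls;
    }

    static void Main()
    {
        DataSet ds = Load(SampleXml);
        foreach (DataRow article in ds.Tables["article"].Rows)
        {
            Console.WriteLine(article["title"]);
            foreach (string url in PhotoUrls(article))
                Console.WriteLine("  " + url);
        }
    }
}
```

In the question's loop, this would amount to replacing the inner `foreach` over `ds.Tables[3].Rows` with a loop over something like `PhotoUrls(row)`.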