I have an XML file from which I want to parse the 'title', 'id' and 'description' (under properties) elements and write them to a CSV file:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<feed xml:base="http://google.com/en-US/syndicate/" xmlns:d="http://schemas.google.com/ado/2007/08/dataservices" xmlns:m="http://schemas.giooglt.com/ado/2007/08/dataservices/metadata" xmlns="http://www.w3.org/2005/Atom">
<title type="text">Partners</title>
<id>http://googlre.com/en-US/syndicate/Partners</id>
<updated>2014-01-16T21:33:20Z</updated>
<link rel="self" title="Partners" href="Partners" />
<entry>
<id>http://pinpoint.microsoft.com/en-US/syndicate/Partners('4555')</id>
<title type="text">M55p; Co</title>
<summary type="text">
cccc is a Certified Partner, reseller, and implementer of
Key industries we work with include:
• Financial services
• Professional services
• Media / publishing
By focusing on mid-market to enterprise clients,
</summary>
<published>2009-07-21T14:23:50-07:00</published>
<updated>2013-11-22T15:00:46-08:00</updated>
<author>
<name>google chrome</name>
<uri>http://google.com/</uri>
<email>retee@gmail.com</email>
</author>
<link rel="edit" title="Partner" href="Partners('4255')" />
<link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/Links" type="application/atom+xml;type=feed" title="Links" href="Partners('4559')/Links">
<m:inline>
<feed>
<title type="text">Links</title>
<id>http://google.com/('429')/Links</id>
<updated>2014-01-16T21:33:20Z</updated>
<link rel="self" title="Links" href="Partners('4ff')/Links" />
<entry>
<id>http://ryryr.com/en-US/syndicate/Links('ufufr')</id>
<title type="text">
</title>
<updated>2014-01-16T21:33:20Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Link" href="Links('partnerpage')" />
<category term="google.Commerce.ferrr.Syndicate.V2010_05.Link" scheme="http://schemas.frrr.com/ado/2007/08/dataservices/scheme" />
<content type="application/xml">
<m:properties>
<d:Type>pgooglrpartnerpage</d:Type>
<d:Description>google Partner Page</d:Description>
<d:Url>http://googlgt.com/en-US/PartnerDetails.aspx?PartnerId=42555&amp;wt.mc_id=66ttet</d:Url>
</m:properties>
</content>
</entry>
<entry>
<id>http://googlet.com/en-US/syndicate/Links('tpartnerrfipage')</id>
<title type="text">
</title>
<updated>2014-01-19T04:01:49Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Link" href="Links('pinpointpartnerrfipage')" />
<category term="google.Commerce.Marketplace.Syndicate.V2010_05.Link" scheme="http://schemas.google.com/ado/2007/08/dataservices/scheme" />
<content type="application/xml">
<m:properties>
<d:Type>tpartnerrfipage</d:Type>
<d:Description>RFI Page</d:Description>
<d:Url>http://pinpoint.microsoft.com/en-US/RFI.aspx?partnerId=4295719419&amp;wt.mc_id=54545</d:Url>
</m:properties>
</content>
</entry>
</feed>
</m:inline>
</link>
</entry>
<entry>
<id>http://pinpoint.microsoft.com/en-US/syndicate/Partners('45')</id>
<title type="text">vfere</title>
<summary type="text">
cccc is a Certified Partner, reseller, and implementer of
Key industries we work with include:
• Financial services
• Professional services
• Media / publishing
By focusing on mid-market to enterprise clients,
</summary>
<published>2009-07-21T14:23:50-07:00</published>
<updated>2013-11-22T15:00:46-08:00</updated>
<author>
<name>google chrome</name>
<uri>http://google.com/</uri>
<email>retee@gmail.com</email>
</author>
<link rel="edit" title="Partner" href="Partners('4255')" />
<link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/Links" type="application/atom+xml;type=feed" title="Links" href="Partners('4559')/Links" >
<m:inline>
<feed>
<title type="text">Links</title>
<id>http://google.com/('429')/Links</id>
<updated>2014-01-16T21:33:20Z</updated>
<link rel="self" title="Links" href="Partners('4ff')/Links" />
<entry>
<id>http://ryryr.com/en-US/syndicate/Links('ufufr')</id>
<title type="text">
</title>
<updated>2014-01-16T21:33:20Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Link" href="Links('partnerpage')" />
<category term="google.Commerce.ferrr.Syndicate.V2010_05.Link" scheme="http://schemas.frrr.com/ado/2007/08/dataservices/scheme" />
<content type="application/xml">
<m:properties>
<d:Type>pgooglrpartnerpage</d:Type>
<d:Description>google Partner Page</d:Description>
<d:Url>http://googlgt.com/en-US/PartnerDetails.aspx?PartnerId=42555&amp;wt.mc_id=66ttet</d:Url>
</m:properties>
</content>
</entry>
<entry>
<id>http://googlet.com/en-US/syndicate/Links('tpartnerrfipage')</id>
<title type="text">
</title>
<updated>2014-01-19T04:01:49Z</updated>
<author>
<name />
</author>
<link rel="edit" title="Link" href="Links('pinpointpartnerrfipage')" />
<category term="google.Commerce.Marketplace.Syndicate.V2010_05.Link" scheme="http://schemas.google.com/ado/2007/08/dataservices/scheme" />
<content type="application/xml">
<m:properties>
<d:Type>tpartnerrfipage</d:Type>
<d:Description>RFI Page</d:Description>
<d:Url>http://pinpoint.microsoft.com/en-US/RFI.aspx?partnerId=4295719419&amp;wt.m</d:Url>
</m:properties>
</content>
</entry>
</feed>
</m:inline>
</link>
</entry>
</feed>
I want to group '/entry/title', '/entry/id' and '/de:entry/link/m:inline/feed/entry/content/m:properties/Url' together and write them to a CSV file. I can parse each of them, but I cannot combine them into rows. The expected output is:
M55p; Co,http://pinpoint.microsoft.com/en-US/syndicate/Partners('4555'),http://googlgt.com/en-US/PartnerDetails.aspx?PartnerId=42555&wt.mc_id=66ttet
M55p; Co,http://pinpoint.microsoft.com/en-US/syndicate/Partners('4555'),http://pinpoint.microsoft.com/en-US/RFI.aspx?partnerId=4295719419&wt.mc_id=54545
vfere,http://pinpoint.microsoft.com/en-US/syndicate/Partners('45'),http://googlgt.com/en-US/PartnerDetails.aspx?PartnerId=42555&wt.mc_id=66ttet
vfere,http://pinpoint.microsoft.com/en-US/syndicate/Partners('45'),http://pinpoint.microsoft.com/en-US/RFI.aspx?partnerId=4295719419&wt.m
My code so far is:
    // Alternate Method for getting the Fields from the XML file
    XmlDocument xmlDocument = new XmlDocument();
    xmlDocument.Load("C:/Users/Administrator/Downloads/direct.xml");
    XmlNamespaceManager xmlnm = new XmlNamespaceManager(xmlDocument.NameTable);
    xmlnm.AddNamespace("de", "http://www.w3.org/2005/Atom");
    xmlnm.AddNamespace("m", "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata");
    xmlnm.AddNamespace("d", "http://schemas.microsoft.com/ado/2007/08/dataservices");
    ParseXML(xmlDocument, xmlnm);
    Debug.WriteLine("\n---XML parsed---");
    string xmlFileName = "C:/Users/Administrator/Downloads/direct.xml";
    XDocument customers = XDocument.Load(xmlFileName);
    var queryResult = from c in customers.Descendants("entry").Attributes() select c.Name;
    foreach (var item in queryResult)
    {
        Debug.WriteLine(item);
    }
}
public static void ParseXML(XmlDocument xmlFile, XmlNamespaceManager xmlnm)
{
    List<string> id = new List<string>();
    List<string> title = new List<String>();
    List<String> city = new List<String>();
    String path = "C:/Users/Administrator/Downloads/data.csv";
    var w = new StreamWriter(path);
    //XmlNodeList nodes = xmlFile.SelectNodes("//ns:entry/ns:updated| //ns:entry/ns:published | //ns:entry/ns:id ", xmlnm);
    XmlNodeList nodes = xmlFile.SelectNodes("//de:entry/de:title | //de:entry/de:link/m:inline/de:feed/de:id | //de:entry/de:link/m:inline/de:feed/de:entry/de:content/m:properties/d:Url | //de:entry/de:link/m:inline/de:feed/de:entry/de:content/m:properties/d:City | //de:entry/de:link/m:inline/de:feed/de:entry/de:content/m:properties/d:State | //de:entry/de:link/m:inline/de:feed/de:entry/de:content/m:properties/d:Country ", xmlnm);
    XmlNodeList nodes1 = xmlFile.SelectNodes("//de:entry", xmlnm);
    var line1 = string.Format("Field" + "," + "Data");
    w.WriteLine(line1);
    w.Flush();
    Debug.WriteLine(nodes1.Count);
    foreach (XmlNode node in nodes)
    {
        Debug.WriteLine(node.Name + " = " + node.InnerXml);
        var line = string.Format(node.Name + "," + node.InnerText);
        w.WriteLine(line);
        w.Flush();
    }
    foreach (XmlNode node in nodes1)
    {
        string titl = node["title"].InnerText;
        string ide = node["id"].InnerText;
        Debug.WriteLine("Data :" + titl + "ID :" + ide);
    }
}
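I can group 'title' and 'id' together, but I cannot read the 'id' under 'properties' in the same pass, as specified in the example. I am a novice programmer and new to C#. Any help is much appreciated.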
Answer 0 (score: 0)
Here is some code that uses XDocument instead of XmlDocument and returns a strongly typed result matching your requirements:
// Requires: using System; using System.IO; using System.Linq; using System.Xml.Linq;
using (var reader = new StreamReader(@"C:/Users/Administrator/Downloads/direct.xml"))
{
    var xmlDoc = XDocument.Load(reader);

    // The namespace URIs must match the xmlns declarations on the <feed> root exactly.
    XNamespace atom = "http://www.w3.org/2005/Atom";
    XNamespace metadata = "http://schemas.giooglt.com/ado/2007/08/dataservices/metadata";
    XNamespace dataservices = "http://schemas.google.com/ado/2007/08/dataservices";

    // One result per top-level <entry>; Urls collects every d:Url in the inlined Links feed.
    var result = xmlDoc.Root.Elements(atom + "entry")
        .Select(e => new {
            Title = e.Element(atom + "title").Value,
            Id = e.Element(atom + "id").Value,
            Urls = e.Elements(atom + "link")
                .Where(l => l.Element(metadata + "inline") != null)
                .SelectMany(l => l.Element(metadata + "inline")
                    .Element(atom + "feed")
                    .Elements(atom + "entry")
                    .Select(e1 => e1.Element(atom + "content")
                        .Element(metadata + "properties")
                        .Element(dataservices + "Url").Value))
        });

    // Flatten to one "title,id,url" line per URL.
    foreach (var entry in result)
    {
        foreach (var url in entry.Urls)
        {
            Console.WriteLine("{0},{1},{2}", entry.Title, entry.Id, url);
        }
    }
}
Here is the structure of the anonymous class it returns:
class Result
{
    public string Title { get; set; }
    public string Id { get; set; }
    public IEnumerable<string> Urls { get; set; }
}
Here is the result for your sample:
Title = "M55p; Co"
Id = "http://pinpoint.microsoft.com/en-US/syndicate/Partners('4555')"
Urls =
{
    "http://googlgt.com/en-US/PartnerDetails.aspx?PartnerId=42555&wt.mc_id=66ttet"
    "http://pinpoint.microsoft.com/en-US/RFI.aspx?partnerId=4295719419&wt.mc_id=54545"
}
Title = "vfere"
Id = "http://pinpoint.microsoft.com/en-US/syndicate/Partners('45')"
Urls =
{
    "http://googlgt.com/en-US/PartnerDetails.aspx?PartnerId=42555&wt.mc_id=66ttet"
    "http://pinpoint.microsoft.com/en-US/RFI.aspx?partnerId=4295719419&wt.m"
}
I'm not quite sure what output CSV format you need, but I think you should be able to build it yourself from this result.
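If the CSV is meant to be opened in a spreadsheet, plain string concatenation can break as soon as a title contains a comma or a quote. The following is a minimal sketch of one way to quote fields, assuming the common RFC 4180 convention; `CsvSketch`, `Escape` and `ToLine` are hypothetical names introduced here for illustration, not part of the original code.

```csharp
using System;
using System.Linq;

public static class CsvSketch
{
    // Quote a field only when it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled (assumed RFC 4180 convention).
    public static string Escape(string field)
    {
        if (field == null) return "";
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Build one CSV line from any number of fields.
    public static string ToLine(params string[] fields)
    {
        return string.Join(",", fields.Select(Escape));
    }

    public static void Main()
    {
        // Values taken from the sample feed; none needs quoting, so the line is unchanged.
        Console.WriteLine(ToLine(
            "M55p; Co",
            "http://pinpoint.microsoft.com/en-US/syndicate/Partners('4555')",
            "http://googlgt.com/en-US/PartnerDetails.aspx?PartnerId=42555&wt.mc_id=66ttet"));

        // A made-up title with a comma and quotes, to show the escaping.
        Console.WriteLine(ToLine("a,b", "say \"hi\""));
        // prints: "a,b","say ""hi"""
    }
}
```

With the `result` sequence from the code above, each row could then be written inside the nested loops with something like `writer.WriteLine(CsvSketch.ToLine(entry.Title, entry.Id, url));`, where `writer` is a `StreamWriter` opened on the target file.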
Edit: for convenience, I added a simple dump to the console, similar to the sample output in the question.
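For readers who prefer to stay with XmlDocument and XPath as in the question, the grouping can also be done by selecting the top-level entries first and then running a relative XPath from each one. This is only a sketch under a few assumptions: `XPathGroupingSketch` and `ExtractRows` are hypothetical names, the inline XML is a cut-down stand-in for the real feed, and the namespace URIs are the ones declared on the sample `<feed>` root (the question's code registers `schemas.microsoft.com` URIs, which would not match the sample as posted).

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class XPathGroupingSketch
{
    // Returns one "title,id,url" line per d:Url found under each top-level entry.
    public static List<string> ExtractRows(XmlDocument doc)
    {
        var ns = new XmlNamespaceManager(doc.NameTable);
        // Prefixes are arbitrary; the URIs must equal the xmlns values in the document.
        ns.AddNamespace("de", "http://www.w3.org/2005/Atom");
        ns.AddNamespace("m", "http://schemas.giooglt.com/ado/2007/08/dataservices/metadata");
        ns.AddNamespace("d", "http://schemas.google.com/ado/2007/08/dataservices");

        var rows = new List<string>();
        // "/de:feed/de:entry" matches only top-level entries;
        // "//de:entry" would also match the entries nested inside m:inline.
        foreach (XmlNode entry in doc.SelectNodes("/de:feed/de:entry", ns))
        {
            string title = entry.SelectSingleNode("de:title", ns)?.InnerText.Trim() ?? "";
            string id = entry.SelectSingleNode("de:id", ns)?.InnerText.Trim() ?? "";

            // No leading slash: the path is evaluated relative to the current entry,
            // which is what keeps each Url paired with its own title and id.
            var urls = entry.SelectNodes(
                "de:link/m:inline/de:feed/de:entry/de:content/m:properties/d:Url", ns);
            foreach (XmlNode url in urls)
            {
                rows.Add(title + "," + id + "," + url.InnerText.Trim());
            }
        }
        return rows;
    }

    public static void Main()
    {
        // Cut-down stand-in for the feed; note that '&' must be written as &amp; in XML.
        const string sample = @"<feed xmlns=""http://www.w3.org/2005/Atom""
      xmlns:m=""http://schemas.giooglt.com/ado/2007/08/dataservices/metadata""
      xmlns:d=""http://schemas.google.com/ado/2007/08/dataservices"">
  <entry>
    <id>P1</id>
    <title>T1</title>
    <link><m:inline><feed><entry><content><m:properties>
      <d:Url>http://example.com/a?x=1&amp;y=2</d:Url>
    </m:properties></content></entry></feed></m:inline></link>
  </entry>
</feed>";

        var doc = new XmlDocument();
        doc.LoadXml(sample);
        foreach (string row in ExtractRows(doc))
        {
            Console.WriteLine(row); // prints: T1,P1,http://example.com/a?x=1&y=2
        }
    }
}
```

The design point is the two-step selection: a single union XPath over the whole document, as in the question, returns a flat node list with no link back to the owning entry, whereas a relative query per entry keeps the three values together.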