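I have some XML that I'm stuck with.
I'm not sure how best to retrieve data from this format. I've considered generating an integer and looping with string concatenation to build the element names, but I'm hoping someone has done something similar and found a smarter solution.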
XML
<TRANSACTION>
<!-- Not ideal, but fairly straight forward. -->
<itema></itema>
<itemb></itemb>
<itemtypea></itemtypea>
<itemtypeb></itemtypeb>
<itemid></itemid>
<itemlabeltypea></itemlabeltypea>
<itemlabeltypeb></itemlabeltypeb>
<savenewitema></savenewitema>
<savenewitemb></savenewitemb>
<!-- One to Many Inserts: Insert0, Insert1, etc. -->
<Insert0></Insert0>
<InsertinItem0></InsertinItem0>
<!-- One to Many Deletes: Again, seriously? -->
<Delete0></Delete0>
<DeletefromItem0></DeletefromItem0>
<!-- One to Many Updates: Why? -->
<Update0></Update0>
<UpdateinItem0></UpdateinItem0>
</TRANSACTION>
LINQ to XML
// Create data object from XML.
var data = (from item in xmlDoc.Descendants("TRANSACTION")
select new
{
// Pseudo code, this will undesirably retrieve Delete0 and DeletefromItem0 as separate records.
// Perhaps a join is necessary and I need to filter out DeletefromItem0 from the left hand table?
// Are there any obvious solutions I may have missed?
DeleteFromItems = from e in item.Elements().Where(x => x.Name.LocalName.StartsWith("Delete"))
select new
{
ItemId = default(int), // Would ideally contain DeletefromItem0.
UniqueId = e.Value
},
InsertIntoItems = from e in item.Elements().Where(x => x.Name.LocalName.StartsWith("Insert"))
select new
{
ItemId = default(int),
UniqueId = e.Value
},
ItemId = item.Element("itemid").Value,
PrimaryItem = new
{
Id = Int32.Parse(item.Element("itema").Value),
IsNew = Boolean.Parse(item.Element("savenewitema").Value),
LabelType = item.Element("itemlabeltypea").Value,
Type = item.Element("itemtypea").Value
},
SecondaryItem = new
{
Id = Int32.Parse(item.Element("itemb").Value),
IsNew = Boolean.Parse(item.Element("savenewitemb").Value),
LabelType = item.Element("itemlabeltypeb").Value,
Type = item.Element("itemtypeb").Value
}
}).First();
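Answer 0 (score: 1)
You should do yourself a big favor and sanitize the document before trying to consume it. You could do this in XSLT, but you might have a hard time. Fortunately, it's not too bad to do with good old LINQ.
Though not strictly necessary, it's a good idea to keep track of any elements that have already been sanitized, so they don't get processed more than once.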
public static class XmlSanitizer
{
static XNamespace NS => "urn:example:sanitizer";
internal static XName IndexName => NS + "Index";
internal static XName SanitizedName => NS + "Sanitized";
public static void Sanitize(XDocument doc, params string[] patterns)
{
if (!HasSanitzerNamespace(doc))
doc.Root.Add(new XAttribute(XNamespace.Xmlns + "s", NS.NamespaceName));
foreach (var pattern in patterns)
{
var nodes =
(from e in doc.Root.Elements()
let m = Regex.Match(e.Name.LocalName, pattern)
where m.Success
let sanitized = (bool?)e.Attribute(SanitizedName)
where !(sanitized ?? false)
select new
{
Element = e,
Namespace = e.Name.Namespace,
LocalName = m.Groups[1].Value,
Index = m.Groups[2].Value,
}).ToList();
foreach (var x in nodes)
{
// it might be preferable to place the new elements within a grouping element
x.Element.ReplaceWith(
new XElement(x.Namespace + x.LocalName,
new XAttribute(IndexName, x.Index),
new XAttribute(SanitizedName, true),
x.Element.Attributes(),
x.Element.Nodes()
)
);
}
}
}
static bool HasSanitzerNamespace(XDocument doc) =>
(from a in doc.Root.Attributes()
where a.Name.Namespace == XNamespace.Xmlns
where (string)a == NS.NamespaceName
select a).Any();
}
public static class XmlSanitizerExtensions
{
static XName IndexName => XmlSanitizer.IndexName;
public static XElement ElementIndex(this XElement e, XName name, string index) => e.Elements(name).Where(n => (string)n.Attribute(IndexName) == index).Single();
}
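Then, to sanitize, pass in regular expressions for the names to group. Each pattern captures the base name in group 1 and the index in group 2. The patterns are anchored with ^ and $ because Regex.Match otherwise matches anywhere in the name, so an unanchored (item)([ab]) would also match savenewitema and rename it to item.
XmlSanitizer.Sanitize(doc, new string[]
{
    // ^ and $ make each pattern match the whole element name
    @"^(item)([ab])$",
    @"^(itemtype)([ab])$",
    @"^(itemlabeltype)([ab])$",
    @"^(savenewitem)([ab])$",
    @"^(Insert)(\d+)$",
    @"^(InsertinItem)(\d+)$",
    @"^(Delete)(\d+)$",
    @"^(DeletefromItem)(\d+)$",
    @"^(Update)(\d+)$",
    @"^(UpdateinItem)(\d+)$",
});
This would give you something like this:
<TRANSACTION xmlns:s="urn:example:sanitizer">
<!-- Not ideal, but fairly straight forward. -->
<item s:Index="a" s:Sanitized="true" />
<item s:Index="b" s:Sanitized="true" />
<itemtype s:Index="a" s:Sanitized="true" />
<itemtype s:Index="b" s:Sanitized="true" />
<itemid></itemid>
<itemlabeltype s:Index="a" s:Sanitized="true" />
<itemlabeltype s:Index="b" s:Sanitized="true" />
<savenewitem s:Index="a" s:Sanitized="true" />
<savenewitem s:Index="b" s:Sanitized="true" />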
<!-- One to Many Inserts: Insert0, Insert1, etc. -->
<Insert s:Index="0" s:Sanitized="true" />
<InsertinItem s:Index="0" s:Sanitized="true" />
<!-- One to Many Deletes: Again, seriously? -->
<Delete s:Index="0" s:Sanitized="true" />
<DeletefromItem s:Index="0" s:Sanitized="true" />
<!-- One to Many Updates: Why? -->
<Update s:Index="0" s:Sanitized="true" />
<UpdateinItem s:Index="0" s:Sanitized="true" />
</TRANSACTION>
With this, at least, the document is easier to process.
var data =
(from t in doc.Elements("TRANSACTION")
select new
{
// assuming the indices are sequential
DeleteFromItems = t.Elements("Delete").Zip(t.Elements("DeletefromItem"), (d, dfi) => new
{
ItemId = (int)dfi, // assuming there's a value
UniqueId = (string)d,
}).ToList(),
InsertIntoItems = t.Elements("Insert").Zip(t.Elements("InsertinItem"), (i, iii) => new
{
ItemId = (int)iii, // assuming there's a value
UniqueId = (string)i,
}).ToList(),
UpdateIntoItems = t.Elements("Update").Zip(t.Elements("UpdateinItem"), (u, uii) => new
{
ItemId = (int)uii, // assuming there's a value
UniqueId = (string)u,
}).ToList(),
ItemId = (string)t.Element("itemid"),
PrimaryItem = new
{
Id = (int)t.ElementIndex("item", "a"),
IsNew = (bool)t.ElementIndex("savenewitem", "a"),
LabelType = (string)t.ElementIndex("itemlabeltype", "a"),
Type = (string)t.ElementIndex("itemtype", "a"),
},
SecondaryItem = new
{
Id = (int)t.ElementIndex("item", "b"),
IsNew = (bool)t.ElementIndex("savenewitem", "b"),
LabelType = (string)t.ElementIndex("itemlabeltype", "b"),
Type = (string)t.ElementIndex("itemtype", "b"),
},
}).Single();
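Additionally, I would group the corresponding elements together during the sanitization phase to make processing even easier. Then you wouldn't have to make so many assumptions about the data. I'll leave that as a learning exercise for you.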
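As one possible step toward that exercise, here is a minimal sketch that pairs elements by their s:Index attribute instead of by position, so the "indices are sequential" assumption made by Zip is no longer needed. It assumes the document has already been sanitized as shown above (attributes in the urn:example:sanitizer namespace); the class name SanitizedPairs and the method Pair are hypothetical and not part of the answer's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original answer.
public static class SanitizedPairs
{
    // Same namespace URI the sanitizer above writes into the document.
    static readonly XName IndexName = XNamespace.Get("urn:example:sanitizer") + "Index";

    // Joins each <valueName> element to the <ownerName> element that carries
    // the same s:Index value, regardless of where either appears in the parent.
    public static List<(string Index, string UniqueId, string ItemId)> Pair(
        XElement parent, XName valueName, XName ownerName) =>
        (from v in parent.Elements(valueName)
         join o in parent.Elements(ownerName)
             on (string)v.Attribute(IndexName) equals (string)o.Attribute(IndexName)
         select (Index: (string)v.Attribute(IndexName),
                 UniqueId: (string)v,
                 ItemId: (string)o)).ToList();
}
```

With this in place, something like SanitizedPairs.Pair(t, "Delete", "DeletefromItem") could stand in for the Zip call in the query above. Elements without a matching partner are simply dropped by the join, which may or may not be what you want.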
Answer 1 (score: 0)
If I understand your question correctly, this extension method might make this kind of data easier to work with:
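public static IEnumerable<XElement> EnumerateGroup(this XElement source, string groupName)
{
    // Match the group name exactly, optionally followed by a single lowercase letter or digit.
    // Regex.Escape guards against group names that contain regex metacharacters.
    var pattern = "^" + Regex.Escape(groupName) + "[a-z0-9]?$";
    return source.Elements()
        .Where(element => Regex.IsMatch(element.Name.LocalName, pattern));
}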
Used as:
XElement xml = XElement.Parse(xmlString);
var results = xml.EnumerateGroup("savenewitem"); // savenewitema, savenewitemb
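The method enumerates all of the child elements, but the regular expression (a topic in itself if you're not familiar with it, though there are plenty of good resources around) returns only those whose name matches the group name exactly, or the group name with one additional character at the end (going by the a, b, 0, etc. in your example; if you have larger numbers, you may need to extend it!).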
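If the numeric suffixes can grow beyond a single digit, one way to extend the idea is sketched below. It accepts either one lowercase letter or a run of digits after the group name. The names XElementGroupExtensions and EnumerateGroupWithSuffix are hypothetical, and the suffix rule is an assumption based on the a, b, 0 examples in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical variant of the extension method above.
public static class XElementGroupExtensions
{
    public static IEnumerable<XElement> EnumerateGroupWithSuffix(this XElement source, string groupName)
    {
        // The group name, optionally followed by ONE lowercase letter OR one or more digits.
        var regex = new Regex("^" + Regex.Escape(groupName) + @"(?:[a-z]|\d+)?$");
        return source.Elements().Where(e => regex.IsMatch(e.Name.LocalName));
    }
}
```

Because the suffix is limited to a single letter or digits only, a call such as EnumerateGroupWithSuffix("Insert") should pick up Insert0 and Insert12 without also returning InsertinItem0.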