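I have some XML files that look like sample file, and I want to remove the invalid xref nodes from them while keeping the contents of those nodes. An xref node is valid if the value of its rid attribute exactly matches the value of the id attribute of any node anywhere in the file. The output file for the sample above should therefore look something like sample output file.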
The code I have written so far is below:

    XDocument doc = XDocument.Load(@"D:\sample\sample.xml", LoadOptions.None);
    var ids = from a in doc.Descendants()
              where a.Attribute("id") != null
              select a.Attribute("id").Value;
    var xrefs = from x in doc.Descendants("xref")
                where x.Attribute("rid") != null
                select x.Attribute("rid").Value;
    if (ids.Any() && xrefs.Any())
    {
        foreach (var xref in xrefs)
        {
            if (!ids.Contains(xref))
            {
                string content = File.ReadAllText(@"D:\sample\sample.xml");
                string result = Regex.Replace(content, "<xref ref-type=\"[^\"]+\" rid=\"" + xref + "\">(.*?)</xref>", "$1");
                File.WriteAllText(@"D:\sample\sample.xml", result);
            }
        }
        Console.WriteLine("complete");
    }
    else
    {
        Console.WriteLine("No value found");
    }
    Console.ReadLine();
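The problem arises when the rid value contains characters such as ., *, or (. These have to be escaped properly in the regex replacement, otherwise the replacement may mess up the file.
Does anyone have a better solution?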
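If the regex route is kept, the escaping issue described above could be handled with Regex.Escape, which turns metacharacters in the rid value into literals. The following is only a minimal sketch, assuming the same attribute order (ref-type, then rid) as the pattern in the code above; the class XrefRegexSketch, the helper Unwrap, and the sample string are hypothetical and not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class XrefRegexSketch
{
    // Hypothetical helper: unwraps <xref ... rid="RID">...</xref> for one literal rid value.
    public static string Unwrap(string content, string rid)
    {
        // Regex.Escape makes characters such as . * ( match literally.
        string pattern = "<xref ref-type=\"[^\"]+\" rid=\"" + Regex.Escape(rid) + "\">(.*?)</xref>";
        return Regex.Replace(content, pattern, "$1");
    }

    public static void Main()
    {
        // Made-up input, for illustration only.
        string xml = "<p>See <xref ref-type=\"fig\" rid=\"f.1*(a)\">Figure 1</xref>.</p>";
        Console.WriteLine(Unwrap(xml, "f.1*(a)")); // prints <p>See Figure 1.</p>
    }
}
```

Even with escaping, this pattern still depends on the attribute order and on the xref content sitting on a single line, so it stays fragile compared with working on the parsed XML tree.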
Answer 0 (score: 1)
You don't need a regular expression to do this. Instead, use element.ReplaceWith(element.Nodes()) to replace the node with its child nodes. Sample code:

    XDocument doc = XDocument.Load(@"D:\sample\sample.xml", LoadOptions.None);
    // use HashSet, since you only use it for lookups
    var ids = new HashSet<string>(from a in doc.Descendants()
                                  where a.Attribute("id") != null
                                  select a.Attribute("id").Value);
    // select both element itself (for update), and value of "rid"
    var xrefs = from x in doc.Descendants("xref")
                where x.Attribute("rid") != null
                select new { element = x, rid = x.Attribute("rid").Value };
    if (ids.Any()) {
        var toUpdate = new List<XElement>();
        foreach (var xref in xrefs) {
            if (!ids.Contains(xref.rid)) {
                toUpdate.Add(xref.element);
            }
        }
        if (toUpdate.Count > 0) {
            foreach (var xref in toUpdate) {
                // replace with contents
                xref.ReplaceWith(xref.Nodes());
            }
            doc.Save(@"D:\sample\sample.xml");
        }
    }
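
To try the same idea without touching a file on disk, the approach can be sketched against an in-memory string. This is a hedged variant, not the code above verbatim: the class XrefCleanupSketch, the helper RemoveInvalidXrefs, and the sample XML are hypothetical, and it uses XDocument.Parse in place of XDocument.Load. It also walks the collected elements in reverse document order, on the assumption that xref nodes might be nested.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XrefCleanupSketch
{
    // Hypothetical helper: unwraps every xref whose rid matches no id in the document.
    public static string RemoveInvalidXrefs(string xml)
    {
        XDocument doc = XDocument.Parse(xml, LoadOptions.PreserveWhitespace);

        // Collect all id values once; a HashSet gives fast lookups.
        var ids = new HashSet<string>(
            doc.Descendants().Attributes("id").Select(a => a.Value));

        // ToList() materializes the query before the tree is modified.
        List<XElement> invalid = doc.Descendants("xref")
            .Where(x =>
            {
                string rid = (string)x.Attribute("rid");
                return rid != null && !ids.Contains(rid);
            })
            .ToList();

        // Go backwards so a nested invalid xref is unwrapped before the one containing it.
        for (int i = invalid.Count - 1; i >= 0; i--)
        {
            invalid[i].ReplaceWith(invalid[i].Nodes());
        }

        return doc.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        // Made-up input: the first xref points at an existing id, the second does not.
        string xml = "<root><fig id=\"f1\">x</fig><p>See <xref ref-type=\"fig\" rid=\"f1\">Fig 1</xref>"
                   + " and <xref ref-type=\"fig\" rid=\"f.9*(x)\"><b>Fig 9</b></xref>.</p></root>";
        Console.WriteLine(RemoveInvalidXrefs(xml));
    }
}
```

With this input the first xref is kept because f1 exists as an id, while the second one is replaced by its child node <b>Fig 9</b>. Since no regular expression is involved, characters such as ., *, or ( in the rid value need no special handling.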