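I want to read in an XML document and return an XML document that contains only unique nodes. If a node contains a duplicated compoundName element, that parent node should be removed.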
<scanSegment>
  <index>28</index>
  <GUID>539003de-1379-4a03-94bf-1ede58625ab5</GUID>
  <ionMode>ESI</ionMode>
  <ionPolarity>Positive</ionPolarity>
  <scanType>DynamicMRM</scanType>
  <dataStorage>PeakDetected</dataStorage>
  <threshold>0</threshold>
  <fragmentorMode>Fixed</fragmentorMode>
  <fragmentorRamp />
  <scheduledTime>4.33</scheduledTime>
  <timeWindow>1.2</timeWindow>
  <scheduledSetting>720</scheduledSetting>
  <isTriggeredMRM>false</isTriggeredMRM>
  <numtMRMRepeats>3</numtMRMRepeats>
  <scanElements>
    <scanElement>
      <index>1</index>
      <compoundName>3-keto carbofuran</compoundName>
      <isISTD>false</isISTD>
      <ms1LowMz>236.1</ms1LowMz>
      <ms1Res>Unit</ms1Res>
      <ms2LowMz>208.1</ms2LowMz>
      <ms2Res>Unit</ms2Res>
      <fragmentor>82</fragmentor>
      <deltaEMV>200</deltaEMV>
      <cellAccVoltage>9</cellAccVoltage>
      <collisionEnergy>4</collisionEnergy>
      <isPrimaryMRM>true</isPrimaryMRM>
      <isTriggerMRM>false</isTriggerMRM>
      <triggerEntranceDelayTime>0</triggerEntranceDelayTime>
      <triggerDelayTime>0</triggerDelayTime>
      <triggerWindow>0</triggerWindow>
      <triggerMRMThreshold>0</triggerMRMThreshold>
      <compoundGroup>
      </compoundGroup>
    </scanElement>
  </scanElements>
</scanSegment>
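The element named "compoundName" is nested inside scanElement, which in turn is nested inside scanElements. I am having trouble filtering the XML document to check whether the "compoundName" element is unique.
I have read a few examples written in LINQ style, such as
xmlDoc.Descendants("scanSegment").GroupBy().Where().Remove()
but I am not sure how to fill in the rest of the query.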
Answer 0 (score: 0)
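One way to remove these duplicated elements is to apply an XSLT stylesheet to the XML. Microsoft's documentation has sample code for this, which I modified to fit your needs.
source.xml is the input file, trans.xslt is the XSLT file, and destination.xml is the output file.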
// Open source.xml as an XPathDocument.
XPathDocument doc = new XPathDocument("source.xml");
// Create and load the transform. The stylesheet contains no embedded script,
// so the default XSLT settings are sufficient.
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load("trans.xslt");
// Create a writer for the transformed file. Disposing the writer flushes
// and closes destination.xml.
using (XmlWriter writer = XmlWriter.Create("destination.xml", transform.OutputSettings))
{
    // Execute the transformation.
    transform.Transform(doc, writer);
}
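Here is the XSLT 1.0 file trans.xslt. The task is carried out by a single template whose match expression is *[count(compoundName) > 1]. It discards every element that has more than one compoundName child, so a scanElement containing two or more compoundName elements is dropped from the output.
In essence, the filtering takes a single line of XSLT. It is combined with the identity template, which copies every node that no other template matches.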
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
  <!-- Identity template - this template is applied by default to all nodes and attributes -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*[count(compoundName) > 1]" />
</xsl:stylesheet>
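For comparison, the same predicate can also be expressed in LINQ to XML. This is only a rough sketch of an equivalent filter, not part of the stylesheet approach; the names RepeatedChildFilter and RemoveParentsWithRepeatedChild are made up for illustration, and the inline XML in Main is a cut-down stand-in for the real file.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class RepeatedChildFilter
{
    // Removes every element that has more than one direct compoundName child,
    // mirroring the XSLT match expression *[count(compoundName) > 1].
    public static void RemoveParentsWithRepeatedChild(XDocument doc)
    {
        doc.Descendants()
           .Where(e => e.Elements("compoundName").Count() > 1)
           .Remove(); // Extensions.Remove snapshots the sequence before deleting
    }

    public static void Main()
    {
        // Hypothetical sample: the first scanElement has two compoundName children.
        XDocument doc = XDocument.Parse(
            @"<scanElements>
                <scanElement>
                  <compoundName>A</compoundName>
                  <compoundName>A</compoundName>
                </scanElement>
                <scanElement>
                  <compoundName>B</compoundName>
                </scanElement>
              </scanElements>");

        RemoveParentsWithRepeatedChild(doc);
        Console.WriteLine(doc);
    }
}
```

Like the stylesheet, this only looks at compoundName children inside a single parent; it does not compare names across sibling scanElement nodes.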
Answer 1 (score: 0)
Try the following:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.xml";

        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load(FILENAME);

            // Only the first scanElements container in the document is processed.
            XElement scanElements = doc.Descendants("scanElements").FirstOrDefault();
            if (scanElements == null) return;

            // Keep the first scanElement for each distinct compoundName value.
            List<XElement> uniqueScanElements = scanElements.Elements("scanElement")
                .GroupBy(x => (string)x.Element("compoundName"))
                .Select(g => g.First())
                .ToList();

            // Replace the old container with a new one holding only the unique elements.
            scanElements.ReplaceWith(new XElement("scanElements", uniqueScanElements));

            // Print the result; call doc.Save(path) instead to write it to a file.
            Console.WriteLine(doc);
        }
    }
}
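The code above keeps the first scanElement of each group. If the goal is instead to drop every scanElement whose compoundName occurs more than once, the GroupBy().Where().Remove() chain from the question could be filled in roughly as follows. This is a sketch under assumptions: duplicates are compared across the whole document rather than per scanSegment, and the names DuplicateCompoundFilter and RemoveDuplicated are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DuplicateCompoundFilter
{
    // Removes every scanElement whose compoundName value is shared with
    // at least one other scanElement in the document (assumed scope).
    public static void RemoveDuplicated(XDocument doc)
    {
        doc.Descendants("scanElement")
           .Where(e => e.Element("compoundName") != null) // skip elements without a name
           .GroupBy(e => (string)e.Element("compoundName"))
           .Where(g => g.Count() > 1)                     // groups with a repeated name
           .SelectMany(g => g)                            // flatten back to elements
           .Remove();                                     // detach them from the tree
    }

    public static void Main()
    {
        // Hypothetical sample: compound "A" appears twice, "B" once.
        XDocument doc = XDocument.Parse(
            @"<scanElements>
                <scanElement><compoundName>A</compoundName></scanElement>
                <scanElement><compoundName>A</compoundName></scanElement>
                <scanElement><compoundName>B</compoundName></scanElement>
              </scanElements>");

        RemoveDuplicated(doc);
        Console.WriteLine(doc);
    }
}
```

To scope the comparison to one scanSegment at a time, the same chain could be run inside a loop over doc.Descendants("scanSegment"), starting each query from the segment instead of the document.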