Question

我有一个像这样的xml文档：

<Node1 attrib1="abc">
    <node1_1>
         <node1_1_1 attrib2 = "xyz" />
    </ node1_1>
</Node1>

<Node2 />

这里<node2 />是我要删除的节点，因为它没有子元素/元素，也没有任何属性。

Answer 1

使用XPath表达式可以找到没有属性或子节点的所有节点。然后可以从xml中删除它们。正如Sani指出的那样，您可能必须递归执行此操作，因为如果删除其内部节点，node_1_1将变为空。

var xmlDocument = new XmlDocument();
xmlDocument.LoadXml(
@"<Node1 attrib1=""abc"">
        <node1_1>
             <node1_1_1 />
        </node1_1>
    </Node1>
    ");

// select all nodes without attributes and without children
var nodes = xmlDocument.SelectNodes("//*[count(@*) = 0 and count(child::*) = 0]");

Console.WriteLine("Found {0} empty nodes", nodes.Count);

// now remove matched nodes from their parent
foreach(XmlNode node in nodes)
    node.ParentNode.RemoveChild(node);

Console.WriteLine(xmlDocument.OuterXml);
Console.ReadLine();

Answer 2

这样的事情应该这样做：

XmlNodeList nodes = xmlDocument.GetElementsByTagName("Node1");

foreach(XmlNode node in nodes)
{
    if(node.ChildNodes.Count == 0)
         node.RemoveAll;
    else
    {
        foreach (XmlNode n in node)
        {
            if(n.InnerText==String.Empty && n.Attributes.Count == 0)
            {
                n.RemoveAll;

            }
        }
    }
}

Answer 3

此样式表使用标识转换，其中空模板匹配没有节点或属性的元素，这将阻止它们被复制到输出中：

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <!--Identity transform copies all items by default -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!--Empty template to match on elements without attributes or child nodes to prevent it from being copied to output -->
    <xsl:template match="*[not(child::node() | @*)]"/>

</xsl:stylesheet>

Answer 4

要对所有空子节点执行此操作，请使用for循环（不是foreach）并按相反顺序。我把它解决为：

var xmlDocument = new XmlDocument();
xmlDocument.LoadXml(@"<node1 attrib1=""abc"">
                         <node1_1>
                            <node1_1_1 />
                         </node1_1>
                         <node1_2 />
                         <node1_3 />
                      </node1>
                      <node2 />
");
RemoveEmptyNodes(xmlDocument );

private static bool RemoveEmptyNodes(XmlNode node)
{
    if (node.HasChildNodes)
    {
        for(int I = node.ChildNodes.Count-1;I >= 0;I--)
            if (RemoveEmptyNodes(node.ChildNodes[I]))
                node.RemoveChild(node.ChildNodes[I]);
    }
    return 
        (node.Attributes == null || 
            node.Attributes.Count == 0) && 
        node.InnerText.Trim() == string.Empty;
}

递归调用（与其他解决方案类似）消除了xPath方法的重复文档处理。更重要的是，代码更易读，更容易编辑。双赢。

因此，此解决方案将删除<node2>，但也会正确删除<node1_2>和<node1_3>。

更新：使用以下Linq实施，发现性能显着提升。

string myXml = @"<node1 attrib1=""abc"">
                         <node1_1>
                            <node1_1_1 />
                         </node1_1>
                         <node1_2 />
                         <node1_3 />
                      </node1>
                      <node2 />
");
XElement xElem = XElement.Parse(myXml);
RemoveEmptyNodes2(xElem);

private static void RemoveEmptyNodes2(XElement elem)
{
    int cntElems = elem.Descendants().Count();
    int cntPrev;
    do
    {
        cntPrev = cntElems;
        elem.Descendants()
            .Where(e => 
                string.IsNullOrEmpty(e.Value.Trim()) && 
                !e.HasAttributes).Remove();
        cntElems = elem.Descendants().Count();
    } while (cntPrev != cntElems);
}

循环处理需要删除父项的情况，因为它的唯一子项已被删除。由于幕后的XContainer实现，使用IEnumerable或派生词往往会有类似的性能提升。这是我最喜欢的事情。

在任意68MB xml文件上RemoveEmptyNodes往往需要大约90秒，而RemoveEmptyNodes2往往需要大约1秒。

XML：如何删除没有属性和子元素的所有节点

4 个答案: