Regular expression conditional problem in C#

Time: 2011-06-11 04:09:02

Tags: c# xml regex xpath xml-parsing

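I have some content nested in span tags. Some of it contains details I need to pull out, and some does not. I can't figure out how to check for both cases and extract the right data. These groups repeat. For example: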

<span name="foo">
    <span name="bar">
        Missing Data
    </span>
</span>
<span name="foo">
    <span name="bar">
        <span name="detail1">first detail</span>
        <span name="detail2">second detail</span>
    </span>
</span>

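I need to capture the details individually when they are present. When they are not, I need to set the corresponding strings in my program to empty while looping through the MatchCollection. In other words, my code needs to set strDetail1 and strDetail2 either to "" or to the values "first detail" and "second detail", if that makes sense.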

1 answer:

Answer 0: (score: 2)

I would suggest using XPath to parse the values. For parsing an XML structure, it will be more reliable than Regex.

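// Namespaces needed by the snippets in this answer.
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

var xml = @"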
    <root>
    <span name=""foo"">
        <span name=""bar"">
            Missing Data
        </span>
    </span>
    <span name=""foo"">
        <span name=""bar"">
            <span name=""detail1"">first detail</span>
            <span name=""detail2"">second detail</span>
        </span>
    </span>
    </root>
";

var document = XDocument.Parse(xml);
// Select every detail span nested under foo/bar and collect its text.
var details = document.XPathSelectElements("//span[@name='foo']/span[@name='bar']/span[starts-with(@name,'detail')]")
    .Select(arg => arg.Value)
    .ToList();

Or with LINQ to XML:

// Casting the attribute to string yields null instead of throwing
// when a span has no name attribute.
var details = document
    .Descendants("span").Where(arg => (string)arg.Attribute("name") == "foo")
    .Elements("span").Where(arg => (string)arg.Attribute("name") == "bar")
    .Elements("span").Where(arg => ((string)arg.Attribute("name") ?? "").StartsWith("detail"))
    .Select(arg => arg.Value)
    .ToList();

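[Edit] I may have misunderstood the question. It looks like you also want to replace or fill in some values. Once you have an XDocument, you can do that with the approaches above. For example, this code clears the values of the detail1 and detail2 elements: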

var detailNodes = document.XPathSelectElements("//span[@name='foo']/span[@name='bar']/span[starts-with(@name,'detail')]")
    .ToList();

detailNodes[0].Value = string.Empty;
detailNodes[1].Value = string.Empty;

var newXml = document.ToString();

[Edit]

How to add the elements:

// Find every bar span that has no child elements (the "Missing Data" case).
var elementsWithMissingDetails = document
    .XPathSelectElements("//span[@name='foo']/span[@name='bar' and count(*)=0]")
    .ToList();

foreach (var elementWithMissingDetails in elementsWithMissingDetails)
{
    elementWithMissingDetails.Add(
        new XElement("span", "first detail", new XAttribute("name", "detail1")));
    elementWithMissingDetails.Add(
        new XElement("span", "second detail", new XAttribute("name", "detail2")));
}

var newXml = document.ToString();
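
If the goal is one pair of values per foo group, with "" where the details are missing (as the question describes for strDetail1 and strDetail2), the same LINQ to XML approach can be applied group by group. The following is only a minimal sketch under some assumptions: GetDetail is a hypothetical helper that is not part of the original answer, the Detail1 and Detail2 property names are illustrative, and the code is written as a C# 9 top-level program for brevity (in older versions the statements would go inside a Main method and GetDetail would become a regular static method).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (an assumption, not from the original answer): returns the
// trimmed text of the child span whose name attribute equals detailName,
// or "" when the bar element has no such child.
static string GetDetail(XElement bar, string detailName)
{
    var match = bar.Elements("span")
        .FirstOrDefault(e => (string)e.Attribute("name") == detailName);
    return match == null ? string.Empty : match.Value.Trim();
}

var xml = @"<root>
  <span name=""foo""><span name=""bar"">Missing Data</span></span>
  <span name=""foo""><span name=""bar"">
    <span name=""detail1"">first detail</span>
    <span name=""detail2"">second detail</span>
  </span></span>
</root>";

var document = XDocument.Parse(xml);

// Produce one entry per foo/bar group, so each result lines up with its group
// even when that group has no detail spans.
var rows = document
    .Descendants("span").Where(e => (string)e.Attribute("name") == "foo")
    .Elements("span").Where(e => (string)e.Attribute("name") == "bar")
    .Select(bar => new
    {
        Detail1 = GetDetail(bar, "detail1"),
        Detail2 = GetDetail(bar, "detail2")
    })
    .ToList();

foreach (var row in rows)
{
    var strDetail1 = row.Detail1;
    var strDetail2 = row.Detail2;
    Console.WriteLine($"detail1='{strDetail1}', detail2='{strDetail2}'");
}
```

With the sample input, the first group yields two empty strings and the second yields "first detail" and "second detail". Returning "" rather than null keeps the loop body free of null checks; if null is preferred, the helper could return match?.Value instead.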