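I have some content nested in span tags. Some of it contains details I need to extract, and some of it does not. I can't figure out how to check for both cases and pull out the right data. The groups repeat. For example: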
<span name="foo">
<span name="bar">
Missing Data
</span>
</span>
<span name="foo">
<span name="bar">
<span name="detail1">first detail</span>
<span name="detail2">second detail</span>
</span>
</span>
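I have to capture the details individually when they are present. Otherwise, while looping through the MatchCollection, I need to set those values to null in my program's strings. In other words, my code needs to set strDetail1 and strDetail2 either to "" or to the values "first detail" and "second detail", if that makes sense.
Answer 0 (score: 2)
I suggest using XPath to extract the values. For parsing an XML structure, it is more reliable than Regex.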
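// Namespaces needed by the snippets in this answer.
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

var xml = @"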
<root>
<span name=""foo"">
<span name=""bar"">
Missing Data
</span>
</span>
<span name=""foo"">
<span name=""bar"">
<span name=""detail1"">first detail</span>
<span name=""detail2"">second detail</span>
</span>
</span>
</root>
";
var document = XDocument.Parse(xml);
var details = document.XPathSelectElements("//span[@name='foo']/span[@name='bar']/span[starts-with(@name,'detail')]")
.Select(arg => arg.Value)
.ToList();
Or with LINQ to XML:
var details = document
// Casting the attribute to string yields null instead of throwing when "name" is absent.
.Descendants("span").Where(arg => (string)arg.Attribute("name") == "foo")
.Elements("span").Where(arg => (string)arg.Attribute("name") == "bar")
.Elements("span").Where(arg => ((string)arg.Attribute("name") ?? "").StartsWith("detail"))
.Select(arg => arg.Value)
.ToList();
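Both queries above return one flat list, so they do not show which group a detail came from or which groups had none. The following is a minimal sketch of a per-group variant that reports a missing detail as an empty string, which is what the question asks for. The `DetailExtractor` class and its `Extract` and `Find` methods are hypothetical names introduced for illustration. It assumes the spans are wrapped in a single root element, as in the `xml` string above, and a compiler that supports tuples (C# 7 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DetailExtractor
{
    // Returns one (Detail1, Detail2) pair per <span name="foo"> group.
    // A detail that is not present is reported as string.Empty.
    public static List<(string Detail1, string Detail2)> Extract(string xml)
    {
        var document = XDocument.Parse(xml);

        return document
            .Descendants("span")
            .Where(span => (string)span.Attribute("name") == "foo")
            .Select(foo =>
            {
                // Child spans of the nested "bar" span; empty when there are none.
                var candidates = foo.Elements("span")
                    .Where(span => (string)span.Attribute("name") == "bar")
                    .Elements("span")
                    .ToList();

                return (Detail1: Find(candidates, "detail1"),
                        Detail2: Find(candidates, "detail2"));
            })
            .ToList();
    }

    // Looks up a span by its name attribute and falls back to string.Empty.
    private static string Find(IEnumerable<XElement> spans, string name)
    {
        var match = spans.FirstOrDefault(span => (string)span.Attribute("name") == name);
        return match == null ? string.Empty : match.Value;
    }
}
```

Calling `DetailExtractor.Extract(xml)` on the sample should give two pairs: empty strings for the "Missing Data" group, and "first detail" / "second detail" for the second group. Each pair can then be assigned to strDetail1 and strDetail2 inside a loop.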
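[Edit] I may have misunderstood the question. It looks like you also want to replace or fill in some values. As long as you have an XDocument, you can do that with the approach above. For example, this code clears the values of the detail1 and detail2 elements: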
var detailNodes = document.XPathSelectElements("//span[@name='foo']/span[@name='bar']/span[starts-with(@name,'detail')]")
.ToList();
detailNodes[0].Value = string.Empty;
detailNodes[1].Value = string.Empty;
var newXml = document.ToString();
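Indexing `detailNodes[0]` and `detailNodes[1]` assumes exactly two matches, which holds for the sample but not necessarily for repeated groups. A loop avoids that assumption. The sketch below wraps the same XPath query in a hypothetical helper (`DetailCleaner.ClearDetails` is a name made up for this example) that clears every matching element, however many there are.

```csharp
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class DetailCleaner
{
    // Clears the text of every detail span and returns the modified XML.
    public static string ClearDetails(string xml)
    {
        var document = XDocument.Parse(xml);

        // Materialize the matches first, then modify them.
        var detailNodes = document
            .XPathSelectElements("//span[@name='foo']/span[@name='bar']/span[starts-with(@name,'detail')]")
            .ToList();

        foreach (var node in detailNodes)
        {
            node.Value = string.Empty;
        }

        return document.ToString();
    }
}
```

Because the loop simply does nothing when there are no matches, the same call is safe on input that contains only "Missing Data" groups.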
[Edit]
How to add elements:
// count(*)=0 matches "bar" spans that have no child elements, i.e. the groups without details.
var elementsWithMissingDetails = document
.XPathSelectElements("//span[@name='foo']/span[@name='bar' and count(*)=0]")
.ToList();
foreach (var elementWithMissingDetails in elementsWithMissingDetails)
{
elementWithMissingDetails.Add(
new XElement("span", "first detail", new XAttribute("name", "detail1")));
elementWithMissingDetails.Add(
new XElement("span", "second detail", new XAttribute("name", "detail2")));
}
var newXml = document.ToString();