Question

I am working on a text summarization method. To test it I have a benchmark that consists of many XML files, and I need to clean those files.

For example, I have an XML file like this:

<sentence id='s0'> The nature of the proceeding 1 The principal issue in this proceeding is whether the Victorian Arts Centre falls within the category of 'premises of State Government Departments and Instrumentalities', for the purposes of provisions in industrial awards relating to rates of payment for persons employed in cleaning those premises.</sentence> <sentence id='s1'>In turn, this depends upon whether the Victorian Arts Centre Trust, a statutory corporation established by the Victorian Arts Centre Act 1979 (Vic) ('the VAC Act'), is properly described as a State Government department or instrumentality, for the purposes of the award provisions.</sentence>

I need to extract the strings between <sentence id='s0'></sentence> and <sentence id='s1'></sentence>. I mean the result should look like this:

The nature of the proceeding 1 The principal issue in this proceeding is whether the Victorian Arts Centre falls within the category of 'premises of State Government Departments and Instrumentalities', for the purposes of provisions in industrial awards relating to rates of payment for persons employed in cleaning those premises. In turn, this depends upon whether the Victorian Arts Centre Trust, a statutory corporation established by the Victorian Arts Centre Act 1979 (Vic) ('the VAC Act'), is properly described as a State Government department or instrumentality, for the purposes of the award provisions.

I found something like this:

Regex.Match("User name (sales)", @"\(([^)]*)\)").Groups[1].Value

which uses Regex, but it does not work. Could you please give me a quick solution to this problem?

Answer 1

Using LINQ to XML should be easier. As Yeldar suggested, a cleaner way is:

var res = XElement.Parse(xml)
                  .Descendants("sentence").Where(e => e.Attribute("id").Value == "s0")
                  .FirstOrDefault().Value;
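
Note that the query above throws a NullReferenceException when no sentence has the requested id, or when a sentence element has no id attribute at all. A minimal sketch of a null-tolerant variant follows; the class name SentenceLookup and the helper GetSentence are hypothetical names chosen for illustration, and the input is assumed to already have a single root element.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SentenceLookup
{
    // Hypothetical helper: returns the text of the <sentence> with the given id,
    // or null when no such sentence exists.
    public static string GetSentence(string xml, string id)
    {
        // The outer (string) cast converts the matched XElement to its text,
        // and yields null (instead of throwing) when nothing matched.
        return (string)XElement.Parse(xml)
            .Descendants("sentence")
            // The inner (string) cast yields null when the "id" attribute is missing.
            .FirstOrDefault(e => (string)e.Attribute("id") == id);
    }
}
```

Passing the predicate straight to FirstOrDefault avoids the separate Where call, and the explicit string conversions that LINQ to XML defines for XElement and XAttribute take care of the null cases.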

Answer 2

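XElement.Parse only works on a string that has a single root node. The example you wrote has two <sentence> nodes and no root node. You can add a root node like this: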

xml = "<root>" + xml + "</root>";
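
Combining the wrapper with LINQ to XML, the sketch below produces the joined text the question asks for. The names SentenceExtractor and ExtractText are hypothetical, and trimming each sentence and joining with a single space is an assumption about the desired output. It only works if the fragment is otherwise well-formed XML.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SentenceExtractor
{
    // Hypothetical helper: wraps the rootless fragment in a dummy <root>
    // element so XElement.Parse accepts it, then joins the trimmed text of
    // every top-level <sentence> element with a single space.
    public static string ExtractText(string fragment)
    {
        var root = XElement.Parse("<root>" + fragment + "</root>");
        return string.Join(" ", root.Elements("sentence").Select(e => e.Value.Trim()));
    }
}
```

Calling ExtractText on the file contents from the question would return the two sentences as one string, without the surrounding tags.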

Get the value between <> with a dynamic number inside
