Question

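I have a simple regex below to pull out a value from a string that is enclosed by "end" on both sides, as in the example below. Silly as it sounds, I'm struggling to get the result I need. Is there something obvious I'm missing? Many thanks.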

var str = "endhelloend";
var match = Regex.Match(str, @"end([a-z]+)end$", RegexOptions.IgnoreCase);

if(match.Success)
{
    result = match.Groups[0].Value  // should return 'hello'
}

Answer 1

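Your pattern correctly contains the group you want to extract. The regex match holds a collection of groups, and the one you want is in that collection. In your example, try the following: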

var str = "endhelloend";
var match = Regex.Match(str, @"end([a-z]+)end$", RegexOptions.IgnoreCase);

if(match.Success)
{
    var hello = match.Groups[1].Value;  // "hello"
}

match.Groups[0] returns the entire match, "endhelloend", so what you want is the first capture group in the match, match.Groups[1].

Answer 2

match.Groups[0] holds the match for the entire regex. Look at match.Groups[1].
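
To make the difference between the two indexes concrete, here is a minimal sketch that returns both values for the question's input. The class name `GroupsDemo` and the helper `WholeAndGroup` are made up for illustration; only `Regex.Match` and `Match.Groups` come from the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupsDemo
{
    // Hypothetical helper: returns { whole match, capture group 1 }, or null when nothing matches.
    public static string[] WholeAndGroup(string input)
    {
        var match = Regex.Match(input, @"end([a-z]+)end$", RegexOptions.IgnoreCase);
        return match.Success
            ? new[] { match.Groups[0].Value, match.Groups[1].Value }
            : null;
    }

    public static void Main()
    {
        var parts = WholeAndGroup("endhelloend");
        Console.WriteLine(parts[0]);  // endhelloend (index 0 is the whole match)
        Console.WriteLine(parts[1]);  // hello       (index 1 is the first parenthesized group)
    }
}
```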

Answer 3

I think that line should read: result = match.Groups[1].Value;

Answer 4

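I see you're struggling with this, so I'll offer some insight.

The regex end([a-z]+)end$ will match the string "endhelloend", and the inner text will be in capture group 1. It will not match when that same text is only a substring of a longer one, such as "endhelloend of the world".

The reason is that you have the end-of-string metacharacter (assertion) $ as part of the regex, right after the final 'end'.

So if the input can continue after the closing 'end', take the $ out of the regex and it should work fine. There are other things to consider as well; I've commented them in your regex below.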

end        // find a literal 'end'
(          // Capture group 1 open
  [a-z]+   // Match as many characters a-z as possible (including 'e', 'n', 'd' in sequence)
)          // Capture group 1 close
end        // find a literal 'end'
$          // End of string assertion (the last 'end' must be the last word in the string)
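
The two points in the commented pattern (the `$` anchor and the greedy `[a-z]+`) can be checked with a small sketch. The class `EndAnchorDemo`, the helper `Extract`, and the last test string are hypothetical additions used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EndAnchorDemo
{
    // Hypothetical helper: returns capture group 1, or null when the pattern does not match.
    public static string Extract(string input, string pattern)
    {
        var match = Regex.Match(input, pattern, RegexOptions.IgnoreCase);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        const string anchored = @"end([a-z]+)end$";
        const string unanchored = @"end([a-z]+)end";

        // With $, the closing 'end' must sit at the end of the input.
        Console.WriteLine(Extract("endhelloend", anchored));                                    // hello
        Console.WriteLine(Extract("endhelloend of the world", anchored) ?? "(no match)");       // (no match)

        // Without $, trailing text is allowed.
        Console.WriteLine(Extract("endhelloend of the world", unanchored));                     // hello

        // [a-z]+ is greedy, so it swallows an inner 'end' when a later one exists.
        Console.WriteLine(Extract("endhelloendworldend", unanchored));                          // helloendworld
    }
}
```

If the shortest run between two 'end' markers is what you want, a lazy quantifier (`[a-z]+?`) is one option to consider.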

Answer 5

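Use solution 1 to extract the text content of the .html file, then use solution 2 to filter the text you need out of it.

To strip the HTML elements from an .htm file, try the following: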

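// Requires: using System.Text;
string CleanXml(string dirtyXml)
{
    // Copy every character that is not part of a <...> tag.
    var clean = new StringBuilder(dirtyXml.Length);
    bool insideTag = false;

    foreach (char c in dirtyXml)
    {
        if (c == '<')
        {
            insideTag = true;       // a tag starts here; skip until '>'
        }
        else if (c == '>' && insideTag)
        {
            insideTag = false;      // the tag is closed; resume copying
        }
        else if (!insideTag)
        {
            clean.Append(c);        // plain text outside any tag
        }
    }
    return clean.ToString();
}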

Regex to filter the text "endhelloend" and get "hello":

    string result = "";
    var str = "endhelloend";
    var match = Regex.Match(str, @"end([a-z]+)end$", RegexOptions.IgnoreCase);
    if(match.Success)
    {
        result = match.Groups[1].Value;  // Returns 'hello'
    }
    Console.WriteLine(result);
    Console.ReadLine();

Answer 6

Try this. It gives you any alphabetic characters between the two occurrences of 'end', without capturing the word 'end' itself:

(?<=end)[a-z]+?(?=end)
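
As a rough sketch of how this lookaround pattern could be used from C#: because the two 'end' markers are only asserted and not consumed, the whole match (`match.Value`) is already the inner text and no capture group is needed. The class `LookaroundDemo` and helper `Between` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: returns the text between two 'end' markers, or null when there is none.
    public static string Between(string input)
    {
        var match = Regex.Match(input, @"(?<=end)[a-z]+?(?=end)", RegexOptions.IgnoreCase);
        return match.Success ? match.Value : null;
    }

    public static void Main()
    {
        Console.WriteLine(Between("endhelloend"));          // hello
        Console.WriteLine(Between("endhelloendworldend"));  // hello (the lazy +? stops at the first 'end')
    }
}
```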
