Question

I'm trying to extract the HTML fragment between two comments.

I then need to parse the HTML that sits between the start and end markers.

I'm actually reading an HTML file, but for testing purposes I've mocked up the following:

        string emailFeedTxtStart = "<!--FEED FOR RECEIPT GOES HERE-->";
        string emailFeedTxtEnd = "<!--FEED FOR RECEIPT ENDS HERE-->";

        string html =
            emailFeedTxtStart + Environment.NewLine +
            @"<td align=""center"">" + Environment.NewLine +
            @"<table style=""table-layout:fixed;width:380px"" border=""0"" cellspacing=""0""             cellpadding=""0"">" + Environment.NewLine +
            "<tbody>" + Environment.NewLine +
            "<tr>" + Environment.NewLine +
            "<td>" + Environment.NewLine +
            "</td>" + Environment.NewLine +
            "</tr>" + Environment.NewLine +
            "</tbody>" + Environment.NewLine +
            "</table>" + Environment.NewLine +
            "</td>"  + Environment.NewLine +
            emailFeedTxtEnd; 

        string patternstart = Regex.Escape(emailFeedTxtStart);
        string patternend = Regex.Escape(emailFeedTxtEnd);
        string regexexpr = patternstart + @"(.*?)" + patternend;
        //string regexexpr = @"(?<=" + patternstart + ")(.*?)(?=" + patternend + ")";

        MatchCollection matches = Regex.Matches(@html, @regexexpr);

This returns 0 matches.

(Note that there is more HTML in between.)

Any help is much appreciated.

Answer 1

What are you going to use to parse the HTML? There may be a way to avoid manipulating the HTML string directly. In any case, here is one solution:

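    // Escape the markers so they are matched literally, then cut out the text between them.
    string afterFirst = html.Substring(Regex.Match(html, Regex.Escape(emailFeedTxtStart)).Index + emailFeedTxtStart.Length);
    string between = afterFirst.Substring(0, Regex.Match(afterFirst, Regex.Escape(emailFeedTxtEnd)).Index);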
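
As for why the pattern in the question finds nothing: in .NET, `.` does not match `\n` unless `RegexOptions.Singleline` is set, so `(.*?)` cannot cross the line breaks between the two comments. Here is a minimal sketch of the regex approach with that option enabled. The class and method names (`CommentSlice`, `Between`) are made up for illustration and are not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentSlice
{
    // Hypothetical helper: returns the text between the two marker comments,
    // or null when the markers do not occur in that order.
    public static string Between(string html, string startMarker, string endMarker)
    {
        // Escape both markers so they are treated as literal text.
        string pattern = Regex.Escape(startMarker) + "(.*?)" + Regex.Escape(endMarker);

        // Singleline makes '.' match '\n' as well, so the lazy group can span lines.
        Match m = Regex.Match(html, pattern, RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Calling `CommentSlice.Between(html, emailFeedTxtStart, emailFeedTxtEnd)` on the mocked string from the question should return the markup between the comments, including the surrounding line breaks, so you may want to `Trim()` the result before parsing it.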

Regex to get the HTML between two comments
