Question

I have this string:

http://localhost:1209/Pages/ap-aria.aspx text: <p><img alt="" src="http://localhost:1209/ckeditor/plugins/imagebrowser/browser/Hydrangeas.jpg" style="width: 1024px; height: 768px;" />qswdqwdqweqweqe</p>

Now I want to get the image tag. How can I split this string so that only the image tag is returned? I want this result:

<img alt="" src="http://localhost:1209/ckeditor/plugins/imagebrowser/browser/Hydrangeas.jpg" style="width: 1024px; height: 768px;" />

Thanks for your help.

Answer 1

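You should not parse HTML with regular expressions (mandatory Regex-HTML link). Using a proper parser (for example, HTML Agility Pack) should solve the problem. You can then combine this and this earlier SO post to accomplish your goal.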

Answer 2

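You can use a regular expression to extract the image tag. You need a non-greedy match that stops at the first occurrence of the end of the tag (/> or </img>). For a demonstration, see this link.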

Putting it into practice:

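        // Requires: using System.Text.RegularExpressions;
        string text = "http://localhost:1209/Pages/ap-aria.aspx text: <p><img alt=\"\" src=\"http://localhost:1209/ckeditor/plugins/imagebrowser/browser/Hydrangeas.jpg\" style=\"width: 1024px; height: 768px;\" />qswdqwdqweqweqe</p>";
        // Non-greedy match: stops at the first "/>" or "</img>" after "<img".
        Regex regex = new Regex("(<img.+?(/>|</img>))");

        // Run the match once and reuse the result instead of matching twice.
        Match match = regex.Match(text);
        if (match.Success)
        {
            // The text contains an img tag.
            string imgTag = match.Value;
        }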
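
If the string may contain more than one image tag, the same non-greedy idea can be wrapped in a small helper that returns every match. This is only a sketch: `ImgTagExtractor` and `ExtractAll` are hypothetical names that do not appear in the original answer, and the `IgnoreCase` and `Singleline` options are assumptions about the input (upper-case tags, tags spanning several lines). Like the pattern above, it only recognizes tags that end in `/>` or `</img>`, so an HTML5-style `<img ...>` without a closing slash is not handled correctly.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class ImgTagExtractor
{
    // Same non-greedy pattern as above, with a word boundary after "img".
    // Singleline lets '.' cross line breaks; IgnoreCase also accepts <IMG ...>.
    private static readonly Regex ImgRegex = new Regex(
        @"<img\b.+?(/>|</img>)",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns every img tag found in the text, in order of appearance.
    public static List<string> ExtractAll(string text)
    {
        var tags = new List<string>();
        foreach (Match m in ImgRegex.Matches(text))
        {
            tags.Add(m.Value);
        }
        return tags;
    }
}
```

Calling `ImgTagExtractor.ExtractAll(text)` on the string from the question should give a list with a single element, the img tag itself. For anything beyond simple, well-formed input, a real HTML parser remains the safer choice.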

Answer 3

If you need to parse HTML, use an available library such as HtmlAgilityPack:

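// Requires the HtmlAgilityPack NuGet package and: using System.Linq;
// htmlText holds the input string from the question.
string result = null;
var htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.LoadHtml(htmlText);
// Take the first <img> element anywhere in the document, if there is one.
var img = htmlDoc.DocumentNode.Descendants("img").FirstOrDefault();
if (img != null)
{
    result = img.OuterHtml;
}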

Result:

<img alt="" src="http://localhost:1209/ckeditor/plugins/imagebrowser/browser/Hydrangeas.jpg" style="width: 1024px; height: 768px;">

Split special text out of a string in C#

3 answers: