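Using Regex to get the src of an image tag

Time: 2014-02-12 07:47:02

Tags: c# regex

I need to use a regular expression to find the src of the first img tag in the following string. How can I do that?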

><div dir="ltr" style="text-align: left;" trbidi="on"><div class="MsoNormal"
 style="background: white; line-height: 15.0pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-outline-level: 2; vertical-align: baseline;"><div class="separator" style="clear: both; text-align: center;"><a href="http://1.bp.blogspot.com/-c-ugY7XUnYo/UoJtj0dzvKI/AAAAAAAAACA/qWtvYnP9wfc/s1600/Screen+shot+2013-11-12+at+10.03.25+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="257" src="http://1.bp.blogspot.com/-c-ugY7XUnYo/UoJtj0dzvKI/AAAAAAAAACA/qWtvYnP9wfc/s320/Screen+shot+2013-11-12+at+10.03.25+AM.png" width="320" /></a></div><h4><span style="background-color: transparent;">With over 150,000 pet care professionals in the United States, your ability to differentiate your business is critical to long-term sustainable growth.  By focusing on the customer experience you can gain the loyalty of prospective pet parents and continue to thrive with your current pack.</span><span style="background-color: transparent;">  </span><span style="background-color: transparent;">Below are 5 ways to differentiate your pet business so you have a leg up on your local competitors.</span></h4></div><div class="MsoNormal"><div

2 answers:

Answer 0 (score: 4)

Don't use Regex to parse HTML. Use a real HTML parser such as HtmlAgilityPack.

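// Requires: using System.Linq; using System.Net; and the HtmlAgilityPack NuGet package.
var html = WebUtility.HtmlDecode(yourtext);
var doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);
// Note: SelectNodes returns null (not an empty collection) when nothing matches.
var urls = doc.DocumentNode.SelectNodes("//img[@src]")
              .Select(img => img.Attributes["src"].Value)
              .ToList();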
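
Since the question asks only for the first image, a variation of the snippet above might use SelectSingleNode instead of collecting every URL. This is a minimal sketch, not part of the original answer: the class name ImgSrcParser and the method name FirstImgSrc are made up for illustration, and it assumes the HtmlAgilityPack package is referenced.

```csharp
using System.Net;
using HtmlAgilityPack;

public static class ImgSrcParser
{
    // Hypothetical helper: returns the src of the first <img> element that
    // has a src attribute, or null when the input contains no such element.
    public static string FirstImgSrc(string text)
    {
        // Decode entities first, in case the markup arrived HTML-encoded.
        string html = WebUtility.HtmlDecode(text);

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns the first match in document order,
        // or null when nothing matches the XPath expression.
        HtmlNode img = doc.DocumentNode.SelectSingleNode("//img[@src]");
        return img == null ? null : img.Attributes["src"].Value;
    }
}
```

Calling ImgSrcParser.FirstImgSrc(yourtext) would then give a single string rather than a list, and a null result signals that no image was found.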

Answer 1 (score: 3)

Try this:

<img.+?src=[\"'](.+?)[\"'].*?>

string src = Regex.Match(original_text, "<img.+?src=[\"'](.+?)[\"'].*?>", RegexOptions.IgnoreCase).Groups[1].Value;
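
The one-liner above yields an empty string when nothing matches, because Groups[1].Value of a failed match is empty. A slightly more defensive sketch checks Match.Success first. The class name ImgSrcRegex and the method name FirstImgSrc are hypothetical; the pattern is the one from this answer, written as a verbatim string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImgSrcRegex
{
    // Same pattern as in the answer. Inside a verbatim string, "" stands for
    // a single double quote, so the character class is ["'].
    private static readonly Regex ImgSrc = new Regex(
        @"<img.+?src=[""'](.+?)[""'].*?>",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns the src of the first <img> tag,
    // or null when the input contains no matching tag.
    public static string FirstImgSrc(string html)
    {
        // Regex.Match stops at the first match, which is what the question asks for.
        Match m = ImgSrc.Match(html);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Note that `.` does not match a newline by default, so an img tag whose attributes are split across lines would be missed unless RegexOptions.Singleline is added. That is one of the reasons the other answer recommends a real parser.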
