How to extract img tags from an email body in VB.NET

Time: 2010-06-11 09:44:35

Tags: vb.net

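I store email content (the message body) in a database. I want to extract the value of the "src" attribute of every image tag (<img>) in that content. A message body may contain one or more images.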

How can I do this in VB.NET?
Thanks.

1 answer:

Answer 0: (score: 6)

You can use a regular expression:

' Requires "Imports System.Text.RegularExpressions" at the top of the file.
' SubjectString is assumed to hold the HTML of the email body.
Try
    Dim RegexObj As New Regex("<img[^>]+src=[""']([^""']+)[""']", RegexOptions.Singleline Or RegexOptions.IgnoreCase)
    Dim MatchResults As Match = RegexObj.Match(SubjectString)
    ' Walk through every <img> match in the body, one at a time
    While MatchResults.Success
        ' SRC attribute is in MatchResults.Groups(1).Value
        MatchResults = MatchResults.NextMatch()
    End While
Catch ex As ArgumentException
    ' Syntax error in the regular expression (which there isn't)
End Try

Here is how it works:

<img[^>]+src=["']([^"']+)["']

Match the characters "<img" literally «<img»
Match any character that is not a ">" «[^>]+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the characters "src=" literally «src=»
Match a single character present in the list ""'" «["']»
Match the regular expression below and capture its match into backreference number 1 «([^"']+)»
   Match a single character NOT present in the list ""'" «[^"']+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match a single character present in the list ""'" «["']»
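
To show how the pattern can be used to collect the values rather than just loop over them, here is a minimal sketch of a helper that returns every captured src as a list. The names ImgSrcDemo and ExtractImageSources and the sample body string are made up for illustration and are not part of the original answer.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ImgSrcDemo
    ' Hypothetical helper: returns the src value of every <img> tag
    ' that the pattern from the answer finds in the given HTML.
    Function ExtractImageSources(ByVal html As String) As List(Of String)
        Dim sources As New List(Of String)()
        If String.IsNullOrEmpty(html) Then Return sources

        ' Same pattern and options as in the answer above.
        Dim pattern As String = "<img[^>]+src=[""']([^""']+)[""']"
        Dim options As RegexOptions = RegexOptions.Singleline Or RegexOptions.IgnoreCase

        ' Regex.Matches returns all matches at once; group 1 holds the src value.
        For Each m As Match In Regex.Matches(html, pattern, options)
            sources.Add(m.Groups(1).Value)
        Next
        Return sources
    End Function

    Sub Main()
        ' Made-up sample body with two images, for illustration only.
        Dim body As String = "<p>Hi</p><img src=""a.png"" alt=""x""><IMG width='10' src='http://example.com/b.jpg'>"
        For Each src As String In ExtractImageSources(body)
            Console.WriteLine(src)
        Next
    End Sub
End Module
```

With the sample body above, this should print a.png and http://example.com/b.jpg on separate lines. Note that the pattern only handles quoted attribute values, and because of the leading [^>]+ it can also match attributes whose names end in "src" (for example data-src), so an HTML parser may be a safer choice for messy markup.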