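I'm parsing email for a project I'm working on. So far I connect to a POP3 mail server, download all the mail on it, and loop through the messages to get the sender, subject, and body.
I then decode the base64 body, which leaves me with a multipart MIME message, like the following test mail I sent myself...
I need to split this multipart MIME email body so that I end up with one string containing only the plain-text version of the mail and another string containing the HTML part.
I'm not interested in anything else the mail might contain; attachments and the like can all be dropped.
Can anyone point me in the right direction?
If I'm going to use a third-party control, does anyone know of a free one that can do this? I will never need to encode, only decode.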
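Answer 0 (score: 1)
Assuming you have extracted the headers of the email, so that you can get the string that identifies the boundaries between its parts, you can parse it with the following code: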
    Imports System.IO
    Imports System.Text.RegularExpressions

    Module Module1

        Sub Main()
            Dim sampleEmail = File.ReadAllText("C:\temp\SampleEmail.eml")

            ' capture whatever follows "boundary=" up to the end of the line
            Dim getBoundary As New Regex("boundary=(.*?)\r\n")
            Dim possibleBoundary = getBoundary.Matches(sampleEmail)
            Dim boundary = ""

            If possibleBoundary.Count = 0 Then
                Console.WriteLine("Could not find boundary specifier.")
                Return
            End If

            ' the boundary string may or may not be surrounded by double-quotes
            boundary = possibleBoundary(0).Groups(1).Value.Trim(CChar(""""))
            Console.WriteLine(boundary)

            ' in the body, each boundary line starts with "--" at the beginning of a line
            boundary = vbCrLf & "--" & boundary
            Dim parts = Regex.Split(sampleEmail, Regex.Escape(boundary))

            Console.WriteLine("Number of parts: " & parts.Length.ToString())

            ' save the parts to one text file for inspection
            Using sw As New StreamWriter("C:\temp\EmailParts.txt")
                For i = 0 To parts.Length - 1
                    ' this is where you would find the part with "Content-Type: text/plain;" -
                    ' you may also need to look at the charset, e.g. charset="utf-8"
                    sw.WriteLine("PART " & i.ToString())
                    sw.WriteLine(parts(i))
                Next
            End Using

            Console.ReadLine()
        End Sub

    End Module
The email I used for testing did not involve any base-64 encoding.
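As a follow-up to the split above, here is a minimal sketch of how the plain-text and HTML parts might be picked out of the `parts` array, including a base64 part. `GetPartBody` and `BuildSampleEmail` are hypothetical names invented for this illustration, the sample message is made up, and the code assumes UTF-8 text and a single, non-nested multipart level; quoted-printable bodies and nested boundaries are not handled.

```vbnet
Imports System
Imports System.Text
Imports System.Text.RegularExpressions

Module PartPicker

    ' Hypothetical helper: returns the decoded body of the first part whose
    ' Content-Type header starts with mimeType, or Nothing if no part matches.
    Function GetPartBody(parts As String(), mimeType As String) As String
        Dim opts = RegexOptions.IgnoreCase Or RegexOptions.Multiline
        For Each part In parts
            ' Drop the line break left over from the boundary line.
            Dim trimmed = part.TrimStart(ControlChars.Cr, ControlChars.Lf)

            ' Headers and body are separated by the first blank line.
            Dim headerEnd = trimmed.IndexOf(vbCrLf & vbCrLf, StringComparison.Ordinal)
            If headerEnd < 0 Then Continue For
            Dim headers = trimmed.Substring(0, headerEnd)
            Dim body = trimmed.Substring(headerEnd + 4)

            If Not Regex.IsMatch(headers, "^Content-Type:\s*" & Regex.Escape(mimeType), opts) Then Continue For

            If Regex.IsMatch(headers, "^Content-Transfer-Encoding:\s*base64", opts) Then
                ' Assumption: the text is UTF-8; a fuller parser would read the charset parameter.
                Return Encoding.UTF8.GetString(Convert.FromBase64String(body))
            End If
            Return body
        Next
        Return Nothing
    End Function

    ' Hypothetical sample message with one plain part and one base64 HTML part.
    Function BuildSampleEmail() As String
        Return String.Join(vbCrLf, {
            "Content-Type: multipart/alternative; boundary=""sample-boundary""",
            "",
            "--sample-boundary",
            "Content-Type: text/plain; charset=""utf-8""",
            "",
            "Hello in plain text",
            "--sample-boundary",
            "Content-Type: text/html; charset=""utf-8""",
            "Content-Transfer-Encoding: base64",
            "",
            "PGI+SGVsbG88L2I+",
            "--sample-boundary--",
            ""})
    End Function

    Sub Main()
        ' Same splitting approach as in the answer: CRLF, two dashes, then the boundary.
        Dim parts = Regex.Split(BuildSampleEmail(), Regex.Escape(vbCrLf & "--sample-boundary"))
        Console.WriteLine(GetPartBody(parts, "text/plain"))
        Console.WriteLine(GetPartBody(parts, "text/html"))
    End Sub

End Module
```

Matching on the `Content-Type` header rather than on the part's position keeps the lookup independent of the order in which the sender emitted the parts.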
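Answer 1 (score: 1)
I would recommend using my free, open-source MimeKit library for this task rather than a regular-expression solution.
I don't really know VB.NET, so the snippet below may not be quite right (I'm a C# guy), but it should give you a general idea of how to accomplish what you want: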
    Imports MimeKit

    ' load and parse the message, then read the HTML and plain-text bodies
    Dim message = MimeMessage.Load("C:\email.msg")
    Dim html = message.HtmlBody
    Dim text = message.TextBody
As you can see, MimeKit makes this sort of thing really trivial.
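Since the question downloads messages from a POP3 server rather than reading them from disk, a sketch of the same idea using an in-memory stream may be useful. It assumes the MimeKit NuGet package is referenced and that the raw message is available as bytes; `LoadFromBytes` and `BuildSampleMessage` are hypothetical helpers written for this illustration, and the sample message is made up.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports MimeKit

Module MimeKitFromStream

    ' Hypothetical helper: parses a raw message held in memory.
    Function LoadFromBytes(raw As Byte()) As MimeMessage
        Using stream As New MemoryStream(raw)
            Return MimeMessage.Load(stream)
        End Using
    End Function

    ' Hypothetical sample: a small multipart/alternative message.
    Function BuildSampleMessage() As Byte()
        Dim raw = String.Join(vbCrLf, {
            "From: sender@example.com",
            "To: recipient@example.com",
            "Subject: Test",
            "MIME-Version: 1.0",
            "Content-Type: multipart/alternative; boundary=""sample-boundary""",
            "",
            "--sample-boundary",
            "Content-Type: text/plain; charset=""utf-8""",
            "",
            "Hello in plain text",
            "--sample-boundary",
            "Content-Type: text/html; charset=""utf-8""",
            "",
            "<b>Hello</b>",
            "--sample-boundary--",
            ""})
        Return Encoding.ASCII.GetBytes(raw)
    End Function

    Sub Main()
        Dim message = LoadFromBytes(BuildSampleMessage())

        ' TextBody and HtmlBody are Nothing when the message has no such part.
        Console.WriteLine(If(message.TextBody, "(no plain-text part)"))
        Console.WriteLine(If(message.HtmlBody, "(no HTML part)"))
    End Sub

End Module
```

Attachments and any other parts are simply ignored when only `TextBody` and `HtmlBody` are read, which matches the question's requirement to drop everything else.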