如何从电子邮件中提取特定文本?

时间:2018-11-04 13:19:55

标签: regex split outlook outlook-vba trim

我正在努力从标准电子邮件中提取姓名和电子邮件地址。

我希望有300封电子邮件具有如下所示的相同格式/布局。

来自:webfeedback@XXXXX.com 发送:2018年11月1日星期四上午10:20 至:乔 主题:2018年TEAM证书

2018年11月1日,星期四-10:20

您希望您的名字出现在参与证书上吗?乔姓 电子邮件地址为必填项ojoelastname@XXXXXXXXX.com

我想提取名称“ Joe LastName”,电子邮件地址ojoelastname@xxxxxxxxxx.com和提交到excel中的日期。

这时,代码正在提取为excel: “您希望您的名字出现在参加证书上吗? OJoe Xaskasdad ”和电子邮件地址“ ojoeXaskasdaa@XXXXXXxXxX.org >”

我正在努力(认真)弄清楚如何仅获取名称“ oJoe Xaskasdad ”和电子邮件 ojoeXaskasdaa@XXXXXXxXxX.org >地址(减号>) ”)。

我是VBA的新手,但喜欢学习。在这一点上,我很困惑,尽管我继续阅读和研究该错误,但我希望有人会热情地提供帮助,因为时间不多了,我可能很快就要做很多复制和粘贴了。

您的建议,建议(修复)将不胜感激。 谢谢您的任何帮助!

当前代码

Sub CopyToExcel13()

    Dim xlApp As Excel.Application
    Dim xlWB As Excel.Workbook
    Dim xlSheet As Excel.Worksheet
    Dim olItem As Outlook.MailItem
    Dim vText As Variant
    Dim sText As String
    Dim vItem As Variant
    Dim i As Long
    Dim RowCount As Long
    Dim sLink As String
    Dim bXStarted As Boolean
    Dim FilePath As String
    Dim sReplace As String

    FilePath = "D:\My Documents\Book1.xlsx" 'the path of the xl workbook'


    If Application.ActiveExplorer.Selection.Count = 0 Then
        MsgBox "No Items selected!", vbCritical, "Error"
    End If

    On Error Resume Next
    Set xlApp = GetObject(, "Excel.Application")

    If Err <> 0 Then
        Application.StatusBar = "Please wait while Excel source is opened ... "
        Set xlApp = CreateObject("Excel.Application")
        bXStarted = True
    End If

    On Error GoTo 0
    '// Open the workbook to input the data
    Set xlWB = xlApp.Workbooks.Open(FilePath) ' Open xlFile
    Set xlSheet = xlWB.Sheets("Sheet1") ' use Sheet1 or Sheet name

    '// Process each selected Mail Item
    For Each olItem In Application.ActiveExplorer.Selection
        sText = olItem.Body ' Email Body
        vText = Split(sText, Chr(13)) ' Chr(13) = Carriage return
'        vPara = Split(sText, Chr(13))

        '// Find the next empty line of the worksheet
        RowCount = xlSheet.Range("A" & xlSheet.Rows.Count).End(xlUp).Row
        RowCount = RowCount + 1

        '// Check each line of text in the message body down loop
        For i = UBound(vText) To 0 Step -1

            '// InStr([start,]mainString, SearchedString[, compare])
            If InStr(1, vText(i), "name to appear") > 0 Then
                '// Split vItem : & :
                vItem = Split(vText(i), Chr(58)) ' Chr(58) = :
                '// Trim = String whose both side spaces needs to be trimmed
                xlSheet.Range("A" & RowCount) = Trim(vItem(0)) ' (0) = Position
            End If

            '// Email Address Required
            If InStr(1, vText(i), "Email Address Required ") > 0 Then
                vItem = Split(vText(i), Chr(58))
                xlSheet.Range("B" & RowCount) = Trim(vItem(1))
            End If

        Next i

        xlWB.Save

    Next olItem

    '// Save & close workbook
    xlWB.Close SaveChanges:=True
    If bXStarted Then
        xlApp.Quit
    End If

    '// Cleanup
    Set xlApp = Nothing
    Set xlWB = Nothing
    Set xlSheet = Nothing
    Set olItem = Nothing

2 个答案:

答案 0 :(得分:0)

您知道这些电子邮件的确切格式吗?例如,我检查过的每封电子邮件的正文都使用CR LF作为换行符,通常超出您的期望。如果我正确解释了您的代码,则您处理的每一行都会以换行开头。

下面是我用来调查希望处理的电子邮件的模块中的代码。

我建议您将代码复制到一个名为“ ModInvestigate”的新模块中。您将需要引用“ Microsoft脚本运行时”和“ Microsoft ActiveX数据对象n.n库”。

从这些电子邮件中选择一些,然后运行宏“ InvestigateEmails”。您会在桌面上找到一个名为“ InvestigateEmails.txt”的新文件。

您直接关注的属性是文本主体。回车,换行和制表符已被替换为“ {cr}”,“ {lf}”和“ {tb}”,因此您可以确切地看到VBA宏会看到的内容。

这可能就是您需要的所有帮助。如果没有,我建议您在问题中包含文本正文的输出,该输出的格式为代码。

Option Explicit
Public Sub InvestigateEmails()

  ' Outputs properties of selected emails to a file.

  ' ???????  No record of when originally coded
  ' 22Oct16  Output to desktop file rather than Immediate Window.

  ' Technique for locating desktop from answer by Kyle:
  ' http://stackoverflow.com/a/17551579/973283
  ' Needs reference to "Microsoft Scripting Runtime"

  Dim Exp As Explorer
  Dim FileBody As String
  Dim Fso As FileSystemObject
  Dim ItemCrnt As MailItem
  Dim Path As String

  Path = CreateObject("WScript.Shell").specialfolders("Desktop")

  Set Exp = Outlook.Application.ActiveExplorer

  If Exp.Selection.Count = 0 Then
    Call MsgBox("Pleaase select one or more emails then try again", vbOKOnly)
    Exit Sub
  Else
    FileBody = ""
    For Each ItemCrnt In Exp.Selection
      With ItemCrnt
        FileBody = FileBody & "From (Sender): " & .Sender & vbLf
        FileBody = FileBody & "From (Sender name): " & .SenderName & vbLf
        FileBody = FileBody & "From (Sender email address): " & _
                              .SenderEmailAddress & vbLf
        FileBody = FileBody & "Subject: " & CStr(.Subject) & vbLf
        Call OutLongText(FileBody, "Text: ", Replace(Replace(Replace(.Body, vbLf, _
                         "{lf}" & vbLf), vbCr, "{cr}"), vbTab, "{tb}"))
        Call OutLongText(FileBody, "Html: ", Replace(Replace(Replace(.HtmlBody, vbLf, _
                         "{lf}" & vbLf), vbCr, "{cr}"), vbTab, "{tb}"))
        FileBody = FileBody & "--------------------------" & vbLf
      End With
    Next
  End If

  Call PutTextFileUtf8NoBOM(Path & "\InvestigateEmails.txt", FileBody)

End Sub
Public Sub OutLongText(ByRef TextOut As String, ByVal Head As String, _
                       ByVal TextIn As String)

  ' Break TextIn into lines of not more than 100 characters
  ' and append to TextOut

  Dim PosEnd As Long
  Dim LenOut As Long
  Dim PosStart As Long

  If TextIn <> "" Then
    PosStart = 1
    Do While PosStart <= Len(TextIn)
      PosEnd = InStr(PosStart, TextIn, vbLf)
      If PosEnd = 0 Or PosEnd > PosStart + 100 Then
        ' No LF in remainder of TextIn or next 100 characters
        PosEnd = PosStart + 99
        LenOut = 100
      Else
        ' Output upto LF.  Restart output after LF
        LenOut = PosEnd - PosStart
        PosEnd = PosEnd
      End If
      If PosStart = 1 Then
        TextOut = TextOut & Head
      Else
        TextOut = TextOut & Space(Len(Head))
      End If
      TextOut = TextOut & Mid$(TextIn, PosStart, LenOut) & vbLf
      PosStart = PosEnd + 1
    Loop
  End If

End Sub
Public Sub PutTextFileUtf8NoBOM(ByVal PathFileName As String, ByVal FileBody As String)

  ' Outputs FileBody as a text file named PathFileName using
  ' UTF-8 encoding without leading BOM

  ' Needs reference to "Microsoft ActiveX Data Objects n.n Library"
  ' Addition to original code says version 2.5. Tested with version 6.1.

  '  1Nov16  Copied from http://stackoverflow.com/a/4461250/973283
  '          but replaced literals with parameters.
  ' 15Aug17  Discovered routine was adding an LF to the end of the file.
  '          Added code to discard that LF.
  ' 11Oct17  Posted to StackOverflow
  '  9Aug18  Comment from rellampec suggested removal of adWriteLine from
  '          WriteTest statement would avoid adding LF.
  ' 30Sep18  Amended routine to remove adWriteLine from WriteTest statement
  '          and code to remove LF from file. Successfully tested new version.

  ' References: http://stackoverflow.com/a/4461250/973283
  '             https://www.w3schools.com/asp/ado_ref_stream.asp

  Dim BinaryStream As Object
  Dim UTFStream As Object

  Set UTFStream = CreateObject("adodb.stream")

  UTFStream.Type = adTypeText
  UTFStream.Mode = adModeReadWrite
  UTFStream.Charset = "UTF-8"
  UTFStream.Open
  UTFStream.WriteText FileBody

  UTFStream.Position = 3 'skip BOM

  Set BinaryStream = CreateObject("adodb.stream")
  BinaryStream.Type = adTypeBinary
  BinaryStream.Mode = adModeReadWrite
  BinaryStream.Open

  UTFStream.CopyTo BinaryStream

  UTFStream.Flush
  UTFStream.Close
  Set UTFStream = Nothing

  BinaryStream.SaveToFile PathFileName, adSaveCreateOverWrite
  BinaryStream.Flush
  BinaryStream.Close
  Set BinaryStream = Nothing

End Sub

答案 1 :(得分:0)

以这种格式提供邮件:

  

来自:webfeedback@XXXXX.com
  发送:2018年11月1日星期四10:20 AM
  致:Joe
  主题:2018年TEAM证书

     

2018年11月1日,星期四-10:20

     

您希望您的名字出现在参与证书上吗?乔·姓氏
  电子邮件地址为必填项ojoelastname@XXXXXXXXX.com

您可以使用“:”定界符来自定义用于结构化行的代码。

Option Explicit

Sub CopyToExcel13()

' With a reference to Excel Object Library

Dim xlApp As Excel.Application
Dim xlWB As Excel.Workbook
Dim xlSheet As Excel.Worksheet

Dim olItem As MailItem

Dim vText As Variant
Dim sText As String
Dim vItem As Variant

Dim i As Long
Dim RowCount As Long

Dim bXStarted As Boolean

Dim FilePath As String

FilePath = "D:\My Documents\Book1.xlsx" 'the path of the xl workbook'

If ActiveExplorer.Selection.count = 0 Then
    MsgBox "No Items selected!", vbCritical, "Error"

    Exit Sub    ' <--

End If

On Error Resume Next
Set xlApp = getObject(, "Excel.Application")
' Discontinue error bypass as soon as possible
On Error GoTo 0

If xlApp Is Nothing Then    ' <--
    Set xlApp = CreateObject("Excel.Application")
    bXStarted = True
End If

' Comment out as applicable
xlApp.Visible = True
xlApp.ScreenUpdating = True

'// Open the workbook to input the data
Set xlWB = xlApp.Workbooks.Open(FilePath) ' Open xlFile
Set xlSheet = xlWB.Sheets("Sheet1") ' use Sheet1 or Sheet name

'// Process each selected Mail Item
For Each olItem In ActiveExplorer.Selection

    sText = olItem.body ' Email Body
    vText = Split(sText, Chr(13)) ' Chr(13) = Carriage return

    '// Find the next empty line of the worksheet
    RowCount = xlSheet.Range("A" & xlSheet.Rows.count).End(xlUp).Row
    RowCount = RowCount + 1

    '// Check each line of text in the message body down loop
    For i = UBound(vText) To LBound(vText) Step -1

        Debug.Print i & ": " & vText(i)

        '// InStr([start,]mainString, SearchedString[, compare])
        If InStr(1, vText(i), "CERTIFICATE OF PARTICIPATION?") > 1 Then
            vItem = Split(vText(i), "CERTIFICATE OF PARTICIPATION?")
            '// Trim = String whose both side spaces needs to be trimmed
            xlSheet.Range("A" & RowCount) = Trim(vItem(1))
        End If

        '// Email Address Required
        If InStr(1, vText(i), "Email Address Required") > 0 Then
            vItem = Split(vText(i), "Email Address Required")
            xlSheet.Range("B" & RowCount) = Trim(vItem(1))
        End If

        'Sent date
        If InStr(1, vText(i), "Sent:") > 0 Then
            vItem = Split(vText(i), Chr(58)) ' Chr(58) = :
            xlSheet.Range("C" & RowCount) = Trim(vItem(1))
        End If

    Next i

    xlWB.Save

Next olItem

'// Save & close workbook
'xlWB.Close SaveChanges:=True
'If bXStarted Then
'    xlApp.Quit
'End If

'// Cleanup
Set xlApp = Nothing
Set xlWB = Nothing
Set xlSheet = Nothing
Set olItem = Nothing

End Sub