使用VBA从文本文件写入excel时保留“列”

时间:2018-10-17 20:48:25

标签: regex excel vba excel-vba

我有一个文本文件,其格式如下:

enter image description here

我正在VBA中使用以下代码将文本文件写入excel:

Sub Test()

 Dim Fn As String, WS As Worksheet, st As String

 Fn = "Path.txt" ' the file path and name
 Set WS = Sheets("Sheet1")

 'Read text file to st string
 With CreateObject("Scripting.FileSystemObject")
    If Not .FileExists(Fn) Then
        MsgBox Fn & "  : is missing."
        Exit Sub
    Else
        If FileLen(Fn) = 0 Then
            MsgBox Fn & "  : is empty"
            Exit Sub
        Else
            With .OpenTextFile(Fn, 1)
             st = .ReadAll
             .Close
            End With
        End If
    End If
 End With

 'Replace every two or more space in st string with vbTab
 With CreateObject("VBScript.RegExp")
  .Pattern = "[ ]{2,}"
  .Global = True
  .Execute st
  st = .Replace(st, vbTab)
 End With

 'Put st string in Clipboard
 With CreateObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
    .SetText st
    .PutInClipboard
 End With

 'Paste Clipboard to range
 WS.Range("A1").PasteSpecial

End Sub

我的目标是保留Excel中文本文件中的列。

但是,我的代码无法判断Plan Type下的空白和Benefit Plan下的空白实际上是两个不同的数据列。它将两列下的空白视为一个长的空白,并且不保留格式。

从视觉上我们知道有列,但是我的代码看不到这一点。

有没有一种编程方法,它可以识别文本文件中有两个空格而不是一个大空格?

我要避免的是必须手动用字符删除它。有可能吗?

5 个答案:

答案 0 :(得分:5)

假设每列的长度为10个字符,我将使用此宽度而不是空格分隔符

Sub FeedTextFileToActiveSheet(ByVal TextFile As String)
  Dim i As Integer, Line As String
  Open TextFile For Input As #1
  While Not EOF(#1)
    i = i + 1
    Input #1, Line
    Range("A" & i) = Trim(Mid(Line, 1, 10))  'Business ID
    Range("B" & i) = Trim(Mid(Line, 11, 10)) 'Employee ID
    ' ... and so on
  Wend
  Close #1
End Sub

要使用它,只需致电FeedTextFileToActiveSheet("Path.txt")

答案 1 :(得分:3)

您是否尝试过Excel的“从文本文件导入选项”? 如果您只想将文本文件导入到带有或不带有标题的excel中,则可以使用can import directly in excel using the built in option available in excel。这可以正确识别标题和空格。要注意的一点是,文本文件的标题应始终位于开头此方法的行。 如果不确定这一点,则可以使用vba脚本。如果是这样,则ferdinando提供的链接将为您提供帮助。

答案 2 :(得分:2)

如果您以直观的方式组织此文件,那么我会遵循这种逻辑。这意味着列的值从以下位置开始 列标题开始。这意味着一列的值在下一列的开始处结束。

有用的图像,描述了逻辑(也包括我使用的示例文本文件):

enter image description here

所有这些逻辑都可以通过读取包含标头的第一行并确定以下内容的所有索引来完成: 每个标题的开头。然后,对于每一行,我们可以轻松确定两个特定索引之间的值, 进行修剪并删除值开头和结尾的多余空格。

尝试以下代码(代码中所有必要的注释):

Sub ReadDataFromCsv()
    Dim Fn As String, WS As Worksheet, st As String, i As Long, columnHeadersIndexes As Object, numberOfColumns As Long
    Fn = "your path here" ' the file path and name
    Set WS = Sheets("Sheet1")
    ' Create array that will hold indexes of a beginning of a column header
    Set columnHeadersIndexes = CreateObject("System.Collections.ArrayList")
    'Read text file to st string
    With CreateObject("Scripting.FileSystemObject")
        If Not .FileExists(Fn) Then
            MsgBox Fn & "  : is missing."
            Exit Sub
        ElseIf FileLen(Fn) = 0 Then
            MsgBox Fn & "  : is empty"
        Else
            With .OpenTextFile(Fn, 1)
                ' Read first line
                st = .ReadLine
                i = 1
                ' Find beginning of first column name
                Do While Mid(st, i, 1) = " "
                    i = i + 1
                Loop
                columnHeadersIndexes.Add (i)
                ' At least two spaces separate two headers, so we can safely add 2 without risk of loosing any letters frmo next header
                i = i + 2
                Dim j As Long: j = 1
                Do While i < Len(st)
                    ' If we have two spaces followed by non-space, then save index (beginning of a header)
                    If Mid(st, i - 2, 2) = "  " And Mid(st, i, 1) <> " " Then
                        ' Set column header
                        Cells(1, j) = Mid(st, columnHeadersIndexes(columnHeadersIndexes.Count - 1), i - columnHeadersIndexes(columnHeadersIndexes.Count - 1) - 1)
                        columnHeadersIndexes.Add (i)
                        j = j + 1
                    End If
                    i = i + 1
                Loop
                ' Set column header
                Cells(1, j) = Trim(Mid(st, columnHeadersIndexes(columnHeadersIndexes.Count - 1), Len(st)))
                numberOfColumns = columnHeadersIndexes.Count
                ' Skip line with ------ characters
                .ReadLine
                Dim currentRow As Long: currentRow = 2
                Do While .AtEndOfStream <> True
                    st = .ReadLine
                    ' Read all columns from a line
                    For i = 0 To numberOfColumns - 2
                        If Len(st) >= columnHeadersIndexes(i) Then
                            cellValue = Mid(st, columnHeadersIndexes(i), columnHeadersIndexes(i + 1) - columnHeadersIndexes(i) - 1)
                            cellValue = Trim(cellValue)
                            Cells(currentRow, i + 1) = cellValue
                        End If
                    Next
                    ' Read last column, if exists
                    If Len(st) >= columnHeadersIndexes(i) Then
                        'here we pass Len(st) as length for substring - it assures that we don't pass too small value and miss some characters
                        cellValue = Mid(st, columnHeadersIndexes(i), Len(st))
                        cellValue = Trim(cellValue)
                        Cells(currentRow, i + 1) = cellValue
                    End If
                    currentRow = currentRow + 1
                Loop
                .Close
            End With
        End If
    End With
End Sub

答案 3 :(得分:1)

如果在记事本中打开文件时,文件看上去与图像完全相似,则很可能是固定宽度。不管是哪种情况,最好还是准备一个空白工作簿,启动“记录宏”,然后尝试打开文本文件。自动文本导入向导将打开。选择固定宽度(最好为定宽)或定界的类型,仔细阅读所提供的指导说明,仔细阅读每一步。 (当要求从行开始导入时,最好提供包含重要数据的第一行,省略标题行等)。文件完全打开后,停止录制。您将有一个这样录制的宏。

Workbooks.OpenText Filename:="C:\Users\user\Desktop\Text.prn", Origin:= _
        xlMSDOS, StartRow:=5, DataType:=xlFixedWidth, FieldInfo:=Array(Array(0, 1) _
        , Array(14, 1), Array(29, 1), Array(44, 1), Array(59, 1), Array(74, 5), Array(89, 1), Array( _
        104, 1)), TrailingMinusNumbers:=True

现在,只需在您的过程中使用该部分代码(可能在文件名等方面进行少量修改)即可打开文本文件。然后只需复制当前区域,然后粘贴到已准备好带有标题等的工作表中即可。

ActiveWorkbook.ActiveSheet.Range("A1").CurrentRegion.Copy ThisWorkbook.Sheets(1).Range("a5")
 ActiveWorkbook.Close False

答案 4 :(得分:0)

您可以:

  • 处理所有带有“-”的行,以获取实际的字段宽度

  • 将所有文本内容粘贴到所需的工作表A列

  • 使用TextToColumns()方法将文本从A列扩展到所需的列,具体取决于对“-”行的正确处理

如下:

Option Explicit

Sub Test()

    Dim Fn As String, WS As Worksheet
    Dim lines As Variant, line As Variant

    Fn = "Path.txt" ' the file path and name
    Set WS = Sheets("Sheet1")

    'Read text file to st string
    With CreateObject("Scripting.FileSystemObject")
       If Not .FileExists(Fn) Then
           MsgBox Fn & "  : is missing."
           Exit Sub
       Else
           If FileLen(Fn) = 0 Then
               MsgBox Fn & "  : is empty"
               Exit Sub
           Else
                With .OpenTextFile(Fn, 1)
                    lines = Split(.readall, vbLf)
                    .Close
                End With
           End If
       End If
    End With

    For Each line In lines ' loop through all text lines
        If InStr(line, "-") > 0 Then Exit For ' loop till you reach the "-"s line, which will be used to get FieldInfo array for textToColumns method
    Next

    With WS
        .Range("a1").Resize(UBound(lines) + 1).Value = Application.Transpose(lines) ' copy all text lines into column A rows
        .Columns(1).TextToColumns Destination:=Range("A1"), DataType:=xlFixedWidth, FieldInfo:=GetFieldInfo(Trim(line)), TrailingMinusNumbers:=True ' convert text to columns feeding FieldInfo array arranged from the "-"s line structure
    End With

End Sub


Function GetFieldInfo(st As String) As Variant()
    Dim i As Long, n As Long, nFields As Long

    nFields = UBound(Split(WorksheetFunction.Trim(st), " ")) ' get the number of fields by counting the "-"s groups separated by single space

    ReDim arrtext(0 To nFields) ' size FieldInfo array accordingly
    Do
        arrtext(i) = Array(n, 1) ' build current FieldInfo array field with current field position in text
        n = InStr(n + 1, st, " -") ' search next field position
        i = i + 1
    Loop While i < nFields
    arrtext(i) = Array(n, 1) ' build last FieldInfo array field with last field position in text

    GetFieldInfo = arrtext ' return FieldInfo array
End Function