我正在寻找在VBA中加载格式化数据的最佳方法。我花了很多时间试图找到类似C-like或类似Fortran的fscanf
类型函数,但没有成功。
基本上我想从文本文件中读取数百万个数字,这些数字放在许多(100,000)行上,每行10个数字(最后一行除外,可能是1-10个数字)。数字用空格分隔,但我事先并不知道每个字段的宽度(这个宽度在数据块之间变化)。 e.g。
397143.1 396743.1 396343.1 395943.1 395543.1 395143.1 394743.1 394343.1 393943.1 393543.1
-0.11 -0.10 -0.10 -0.10 -0.10 -0.09 -0.09 -0.09 -0.09 -0.09
0.171 0.165 0.164 0.162 0.158 0.154 0.151 0.145 0.157 0.209
以前我使用过Mid
函数,但在这种情况下我不能,因为我事先并不知道每个字段的宽度。此外,在Excel工作表中加载太多行。我可以想到一种蛮力的方式,我可以看到每个连续的角色并确定它是空格还是数字,但它看起来非常笨拙。
我也对如何编写格式化数据的指针感兴趣,但这似乎更容易 - 只需格式化每个字符串并使用&
连接它们。
答案 0 :(得分:4)
以下代码段将从文本文件中读取以空格分隔的数字:
Dim someNumber As Double
Open "YourDataFile.txt" For Input As #1
Do While Not (EOF(1))
Input #1, someNumber
`// do something with someNumber here...`
Loop
Close #1
更新:以下是您一次只读一行的方法,每行都有可变数量的项目:
Dim someNumber As Double
Dim startPosition As Long
Dim endPosition As Long
Dim temp As String
Open "YourDataFile" For Input As #1
Do While Not (EOF(1))
startPosition = Seek(1) '// capture the current file position'
Line Input #1, temp '// read an entire line'
endPosition = Seek(1) '// determine the end-of-line file position'
Seek 1, startPosition '// jump back to the beginning of the line'
'// read numbers from the file until the end of the current line'
Do While Not (EOF(1)) And (Seek(1) < endPosition)
Input #1, someNumber
'// do something with someNumber here...'
Loop
Loop
Close #1
答案 1 :(得分:2)
您还可以使用正则表达式将多个空格替换为一个空格,然后对每一行使用拆分函数,如下面的示例代码所示。
处理完65000行后,新工作表将添加到Excel工作簿中,因此源文件可能大于Excel中的最大行数。
Dim rx As RegExp
Sub Start()
Dim fso As FileSystemObject
Dim stream As TextStream
Dim originalLine As String
Dim formattedLine As String
Dim rowNr As Long
Dim sht As Worksheet
Dim shtCount As Long
Const maxRows As Long = 65000
Set fso = New FileSystemObject
Set stream = fso.OpenTextFile("c:\data.txt", ForReading)
rowNr = 1
shtCount = 1
Set sht = Worksheets.Add
sht.Name = shtCount
Do While Not stream.AtEndOfStream
originalLine = stream.ReadLine
formattedLine = ReformatLine(originalLine)
If formattedLine <> "" Then
WriteValues formattedLine, rowNr, sht
rowNr = rowNr + 1
If rowNr > maxRows Then
rowNr = 1
shtCount = shtCount + 1
Set sht = Worksheets.Add
sht.Name = shtCount
End If
End If
Loop
End Sub
Function ReformatLine(line As String) As String
Set rx = New RegExp
With rx
.MultiLine = False
.Global = True
.IgnoreCase = True
.Pattern = "[\s]+"
ReformatLine = .Replace(line, " ")
End With
End Function
Function WriteValues(formattedLine As String, rowNr As Long, sht As Worksheet)
Dim colNr As Long
colNr = 1
stringArray = Split(formattedLine, " ")
For Each stringItem In stringArray
sht.Cells(rowNr, colNr) = stringItem
colNr = colNr + 1
Next
End Function