This is my sample file:
col1,col2,colx,col3,col4,col5
1,A,,AA,X,Y
2,B,,,*/;wBB,D --invalid or bad
3,E,,,....;*()//FF,Y --invalid or bad
4,G,,,.,;'()XX,P --invalid or bad
5,P,Kk,,...(),D
After following the instructions from here, I have
2,B,,,BB,D
3,E,,,FF,Y
4,G,,,XX,P
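as the bad data in the CSV file. My task is to validate each record by splitting it into columns, check for extra delimiters, and remove any that are found.
This is what I tried: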
Sub File validation()
    Dim goFS: Set goFS = CreateObject("Scripting.FileSystemObject") ' (2)
    Dim tsIn: Set tsIn = goFS.OpenTextFile("....bad.csv")
    Do Until tsIn.AtEndOfStream
        sLine = tsIn.ReadLine()
        If sLine = EOF then exit else Loop ' I get an error here
        Dim str : strconv(sLine) ' error
End Sub
Function strConv(ByVal str As String) As String
    Dim objRegEx As Object, allMatches As Object
    Set objRegEx = CreateObject("VBScript.RegExp")
    With objRegEx
        .MultiLine = False
        .IgnoreCase = False
        .Global = True
        .Pattern = ",,,"
    End With
    strConv = objRegEx.Replace(str, ",,")
End Function
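I need a solution, with or without Regex, that validates this file and writes it back to the source file.
I am very new to VBA scripting. Can someone help me?
After validation I need the file to look like this: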
col1,col2,colx,col3,col4,col5
1,A,,AA,X,Y
2,B,,BB,D,
3,E,,FF,Y,
4,G,,XX,P,
5,P,Kk,,,D
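Answer 0 (score: 0)
Are you saying that the rows without a colX value are "bad"? It looks like none of them have a value. In any case, you can easily check for a value in colX.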
Do While Not tsIn.AtEndOfStream
    ' Read and split the line...
    a = Split(tsIn.ReadLine, ",")
    ' Check for a value in "colX"...
    If Len(Trim(a(2))) = 0 Then
        ' Not sure what you want to do here. Replace it with another value?
        a(2) = "0"
    End If
    ' Write the line to another file...
    tsOut.WriteLine Join(a, ",")
Loop
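If the goal is the layout shown at the end of the question, the same Split/Join idea can be used without a RegExp. The following is only a minimal sketch: FixLine is a hypothetical helper name, and it assumes that a row is "bad" exactly when both colx and col3 (the third and fourth fields) are empty, in which case the later fields are shifted left by one and an empty field is appended so the column count stays the same.

```vbscript
' Hypothetical helper: removes the extra delimiter after an empty colx
' and pads the end of the row so the number of columns is unchanged.
Function FixLine(sLine)
    Dim a : a = Split(sLine, ",")
    FixLine = sLine
    ' Too few fields to inspect: leave the line untouched.
    If UBound(a) < 3 Then Exit Function
    ' Assumption: colx (index 2) and col3 (index 3) both empty means "bad".
    If Len(Trim(a(2))) = 0 And Len(Trim(a(3))) = 0 Then
        Dim i
        ' Shift the remaining fields one position to the left...
        For i = 3 To UBound(a) - 1
            a(i) = a(i + 1)
        Next
        ' ...and leave the last column empty.
        a(UBound(a)) = ""
        FixLine = Join(a, ",")
    End If
End Function
```

Inside the loop above this would be called as tsOut.WriteLine FixLine(tsIn.ReadLine). The header row and rows such as 1,A,,AA,X,Y or 5,P,Kk,,,D pass through unchanged because their colx or col3 is not empty.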
Answer 1 (score: 0)
An 'experimental function' (see here) to work out a RegExp that turns the bad lines into good ones:
Function demoRegExp()
    demoRegExp = 0
    ' Pairs of (input, expected output)
    Dim aTests : aTests = Array( _
          "2,B,,,BB,D", "2,B,,BB,D," _
        , "3,E,,,FF,Y", "3,E,,FF,Y," _
        , "field,no comma here,,,what,ever", "field,no comma here,,what,ever," _
    )
    Dim sC : sC = ","      ' the separator
    Dim sF : sF = "[^,]+"  ' a non-empty field
    Dim r : Set r = New RegExp
    r.Pattern = Join(Array("^(", sF, sC, sF, sC, sC, ")(", sC, ")(", sF, sC, sF, ")$"), "")
    ' qq() is a helper (not shown) that wraps a string in double quotes
    WScript.Echo "pattern:", qq(r.Pattern)
    Dim i
    For i = 0 To UBound(aTests) Step 2
        Dim sInp : sInp = aTests(i + 0)
        Dim sExp : sExp = aTests(i + 1)
        ' Move the extra separator ($2) to the end of the line
        Dim sAct : sAct = r.Replace(sInp, "$1$3$2")
        WScript.Stdout.Write qq(sInp) & " => " & qq(sAct)
        If sAct = sExp Then
            WScript.Echo " ok"
        Else
            WScript.Echo " Fail - exp:", qq(sExp)
        End If
    Next
End Function
Output:
pattern: "^([^,]+,[^,]+,,)(,)([^,]+,[^,]+)$"
"2,B,,,BB,D" => "2,B,,BB,D," ok
"3,E,,,FF,Y" => "3,E,,FF,Y," ok
"field,no comma here,,,what,ever" => "field,no comma here,,what,ever," ok