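TL;DR: How can I programmatically flag a quotation mark (") when it is not part of a quote-comma (",) or comma-quote (,") pair?
I am running a program that opens a CSV file, reads each line, and then splits the line on the commas. Enough of the text fields contain quotes that I am using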
filereader1.HasFieldsEnclosedInQuotes = True
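However, when the files were created, nobody made sure each line had an even number of quotes. Most of the time this is no big deal; there are only a few instances per folder.
But I have run into some places where the number is huge: dozens of instances in files of a few thousand lines. There is no easy way to error-check these by hand.
So I am trying to validate whether a string contains a rogue quote. A comma-quote (,") or a quote-comma (",) is fine, but a free-floating quote (") should bring up an input box showing the line of text so it can be fixed manually.
I can't simply test for an odd number of quotes, because I have found lines with an even number of errant quotes.
Here is the code.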
Using filereader1 As New Microsoft.VisualBasic.FileIO.TextFieldParser(files_(i))
    filereader1.TextFieldType = FieldType.Delimited
    filereader1.Delimiters = New String() {","}
    filereader1.HasFieldsEnclosedInQuotes = True
    While Not filereader1.EndOfData
        'While (filereader1.EndOfData = False) ' looks for the end of the file and resets stuff
        split_string = filereader1.ReadFields()
        ' ... rest of the loop body is not shown in the post ...
    End While
End Using
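Here is my idea. I want to call ReadLine instead of ReadFields and assign the result to a variable. If the line contains a quote that is neither a quote-comma nor a comma-quote, the variable is shown in an input box for manual correction, and the corrected value is then parsed into the split_string array. If all the quotes follow the rules above, the string is parsed normally.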
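The check described above could be sketched roughly as follows. This is a minimal sketch, not code from the post: FindRogueQuotes and the sample line are hypothetical, and treating a quote at the very start or end of the line as valid is an added assumption. Doubled quotes ("") inside a field would also be flagged by this check.

```vb
Imports System
Imports System.Collections.Generic

Module RogueQuoteCheck
    ' Hypothetical helper: returns the zero-based positions of quotes that are
    ' neither preceded nor followed by a comma.
    ' Assumption: a quote at the very start or end of the line counts as valid.
    Function FindRogueQuotes(line As String) As List(Of Integer)
        Dim rogue As New List(Of Integer)
        For i As Integer = 0 To line.Length - 1
            If line(i) <> """"c Then Continue For
            ' comma-quote (,") or start of line
            Dim opensField As Boolean = (i = 0) OrElse line(i - 1) = ","c
            ' quote-comma (",) or end of line
            Dim closesField As Boolean = (i = line.Length - 1) OrElse line(i + 1) = ","c
            If Not (opensField OrElse closesField) Then rogue.Add(i)
        Next
        Return rogue
    End Function

    Sub Main()
        ' Sample line: 1,"ok",5" pipe,x  (the quote after the 5 is free-floating)
        Dim sample As String = "1,""ok"",5"" pipe,x"
        For Each position As Integer In FindRogueQuotes(sample)
            Console.WriteLine(position) ' prints 8
        Next
    End Sub
End Module
```

A non-empty result would be the signal to show the line in the input box for manual repair; an empty result means the line can go straight to the normal parser.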
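Answer 0 (score: 0)
Could you count the occurrences of each kind of substring in the line returned by ReadLine? If the total number of quotes does not equal the combined count of ", and ," then you have a problem.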
Public Function CountChar(originalString As String, findString As String) As Long
    ' Counts case-insensitive occurrences of findString in originalString
    ' (overlapping occurrences are each counted)
    Dim lAns As Long = 0
    Dim lCharLen As Integer = Len(findString)
    Dim lEndOfLoop As Integer = (Len(originalString) - lCharLen) + 1
    For lCtr As Integer = 1 To lEndOfLoop
        ' Mid is 1-based: take lCharLen characters starting at position lCtr
        Dim sChar As String = Mid(originalString, lCtr, lCharLen)
        If StrComp(sChar, findString, vbTextCompare) = 0 Then lAns = lAns + 1
    Next
    Return lAns
End Function
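Usage
' If the count of all quotes does not equal the count of ", plus the count of ,", then there is an issue.
' Note: a quote at the very start or end of the line has no adjacent comma, so neither pattern matches it.
If CountChar(thisLine, Chr(34)) <> (CountChar(thisLine, Chr(34) & ",") + CountChar(thisLine, "," & Chr(34))) Then
    ' rogue quotes
End If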
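Answer 1 (score: 0)
So, this is what I ended up doing.
I read each line from the CSV file.
I count how many quotes are in the line.
If the count is zero, I parse on commas alone.
If there is an odd number of quotes, I remove all the quotes from the line and send it to manual error checking.
If there is an even number of quotes, I replace the strings ," and ", with ::
Then I parse on both commas and ::
This seems to work.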
' Requires: Imports Microsoft.VisualBasic.FileIO and Imports System.Text.RegularExpressions
Using filereader As New Microsoft.VisualBasic.FileIO.TextFieldParser(files_(i), System.Text.Encoding.Default) 'system text decoding adds odd characters
    ' Parser settings only need to be applied once, before the read loop
    filereader.TextFieldType = FieldType.Delimited
    'filereader.Delimiters = New String() {","}
    filereader.SetDelimiters(",") 'tried new from Don's program 6/12
    filereader.HasFieldsEnclosedInQuotes = True
    While Not filereader.EndOfData
        Try
            'split_string = filereader1.ReadFields()
            ' ReadLine returns the raw line without parsing it into fields,
            ' so the handler below is unlikely to fire
            whole_string = filereader.ReadLine()
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MessageBox.Show(ex.Message & " : " & FileName & " ; " & filereader.ErrorLine)
            error_throw = filereader.ErrorLine
            error_throw = error_throw.Replace("""", " ")
            split_string = Split(error_throw, ",")
            'MsgBox("In catch routine, split string (0) " & split_string(0))
        End Try
        ' Count the quotes in the raw line
        Dim cnt As Integer = 0
        For Each c As Char In whole_string
            If c = """"c Then cnt += 1
        Next
        'MsgBox("cnt = " & cnt)
        If cnt = 0 Then 'no quotes
            split_string = Split(whole_string, ",") 'split by commas
            'MsgBox("no quotes")
        ElseIf cnt Mod 2 = 0 Then 'even number of quotes
            ' Replace comma-quote (,") and quote-comma (",) with the :: marker
            Dim pattern1 As String = "\.?(,"")+"
            Dim pattern2 As String = "\.?("",)+"
            Dim rgex1 As New Regex(pattern1)
            Dim rgex2 As New Regex(pattern2)
            Dim replace1 As String = "::"
            Dim replace2 As String = "::"
            Dim whole_string1 As String = rgex1.Replace(whole_string, replace1)
            Dim whole_string2 As String = rgex2.Replace(whole_string1, replace2)
            'MsgBox(whole_string & " >> " & whole_string1 & " >> " & whole_string2)
            'split_string = Split(whole_string2, ",") 'non-regex code that allows program to run
            split_string = Regex.Split(whole_string2, ",|([<::>+].*[<::>+])")
            '(",(?=(?:[^\""]*\""[^\""]*\"")*(?![^\""]*\""))")
            'MsgBox("Before " & split_string(0) & " | " & split_string(1) & " | " & split_string(2) & " | " & split_string(3) & " | " & split_string(4) & " | " & split_string(5) & " | " & split_string(6) & " | " & split_string(7))
            ' Strip the :: markers left in the split fields
            Dim arraycount_2 As Integer = split_string.GetUpperBound(0)
            For p = 0 To arraycount_2
                split_string(p) = split_string(p).Replace("::", "")
            Next
            'MsgBox("After " & split_string(0) & " | " & split_string(1) & " | " & split_string(2) & " | " & split_string(3) & " | " & split_string(4) & " | " & split_string(5) & " | " & split_string(6) & " | " & split_string(7))
        Else 'odd number of quotes
            'MsgBox("Odd quotes")
            whole_string = whole_string.Replace("""", " ") 'replace every quote with a space
            split_string = Split(whole_string, ",") 'split by commas
        End If
    End While
End Using
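For comparison, the same three-way branch on the quote count can be written more compactly. This is a rough sketch under stated assumptions rather than the code above: SplitLine and needsReview are hypothetical names, and for the even-quote case it swaps the :: replacement for a TextFieldParser reading the single line through a StringReader. It also removes quotes outright instead of replacing them with spaces.

```vb
Imports System
Imports System.IO
Imports System.Linq
Imports Microsoft.VisualBasic.FileIO

Module QuoteCountSplit
    ' Hypothetical helper: splits one CSV line according to its quote count.
    ' needsReview is set to True when the line should go to manual error checking.
    Function SplitLine(line As String, ByRef needsReview As Boolean) As String()
        needsReview = False
        Dim quoteCount As Integer = line.Count(Function(ch) ch = """"c)

        ' No quotes: a plain comma split is enough.
        If quoteCount = 0 Then Return line.Split(","c)

        ' Odd number of quotes: drop the quotes and flag the line for review.
        If quoteCount Mod 2 <> 0 Then
            needsReview = True
            Return line.Replace("""", "").Split(","c)
        End If

        ' Even number of quotes: let TextFieldParser handle the quoted fields.
        ' (Assumption: this stands in for the :: replacement used in the answer.)
        Try
            Using parser As New TextFieldParser(New StringReader(line))
                parser.TextFieldType = FieldType.Delimited
                parser.SetDelimiters(",")
                parser.HasFieldsEnclosedInQuotes = True
                Return parser.ReadFields()
            End Using
        Catch ex As MalformedLineException
            ' The parser rejected the line: fall back to the odd-quote handling.
            needsReview = True
            Return line.Replace("""", "").Split(","c)
        End Try
    End Function
End Module
```

The caller would read each raw line with ReadLine, pass it to SplitLine, and route the line to the manual check whenever needsReview comes back True.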