Quote errors when splitting a string

Time: 2014-05-09 15:20:07

Tags: vb.net

TL;DR: How do I programmatically flag a quote (") when it is not part of a quote-comma (",) or a comma-quote (,")?

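I'm running a program that opens a CSV file, reads each line, and splits the line at the commas. Enough of the text strings contain quotes that I'm using filereader1.HasFieldsEnclosedInQuotes = True. However, when the files were created, nobody made sure that each line contains an even number of quotes.

Most of the time this is no big deal: there are only a few instances per folder. But I have run into some files with a large number of them, dozens of instances in a file of a few thousand lines, and there is no easy way to error-check those by hand.

So I'm trying to validate whether a string has rogue quotes. A comma-quote (,") or a quote-comma (",) is fine, but a free-floating quote (") should bring up an input box that shows the line of text so it can be fixed manually.

I can't simply test for an odd number of quotes, because I have found lines where the number of bad quotes is even.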

Here is the code.

Using filereader1 As New Microsoft.VisualBasic.FileIO.TextFieldParser(files_(i))

    filereader1.TextFieldType = FieldType.Delimited
    filereader1.Delimiters = New String() {","}
    filereader1.HasFieldsEnclosedInQuotes = True

    While Not filereader1.EndOfData
        'While (filereader1.EndOfData = False) ' looks for the end of the file and resets stuff
        split_string = filereader1.ReadFields()

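Here is my idea. I want to call ReadLine instead of ReadFields and assign the result to a variable. If the line contains a quote that is neither a quote-comma nor a comma-quote, the variable is shown in an input box for manual correction, and the corrected variable is then parsed into the split_string array. If every quote satisfies the rule above, the string is parsed normally.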
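The check described above can be sketched as a small helper. This is only an illustration: the module name RogueQuoteCheck and the function FindRogueQuotes are made up for this example and are not part of the original program. It also assumes that a quote at the very start or end of a line counts as valid, and it will flag doubled quotes ("") inside a field even though standard CSV uses them as an escape.

```vbnet
Imports System.Collections.Generic

' Hypothetical helper, not from the original post.
Module RogueQuoteCheck
    ' Returns the zero-based positions of quotes that are neither next to a
    ' comma nor at the start or end of the line.
    Public Function FindRogueQuotes(line As String) As List(Of Integer)
        Dim positions As New List(Of Integer)
        For i As Integer = 0 To line.Length - 1
            If line(i) <> """"c Then Continue For
            ' A quote may open a field: start of line or right after a comma.
            Dim opensField As Boolean = (i = 0) OrElse line(i - 1) = ","c
            ' A quote may close a field: end of line or right before a comma.
            Dim closesField As Boolean = (i = line.Length - 1) OrElse line(i + 1) = ","c
            If Not (opensField OrElse closesField) Then positions.Add(i)
        Next
        Return positions
    End Function
End Module
```

A line whose result list is empty could go straight to the parser, while a non-empty list could trigger the input box with the offending positions highlighted.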

2 Answers:

Answer 0: (score: 0)

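Could you count the occurrences of each kind of substring in the line returned by ReadLine? If the total number of quotes does not equal the combined count of ", and ," then you have a problem.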

Public Function CountChar(originalString As String, findString As String) as Long
    Dim lLen As Long = 0
    Dim lCharLen As Long = 0
    Dim lAns As Long = 0
    Dim sChar As String = ""
    Dim lCtr As Long = 0
    Dim lEndOfLoop As Long = 0

    lLen = Len(originalString)
    lCharLen = Len(findString)
    lEndOfLoop = (lLen - lCharLen) + 1

    For lCtr = 1 To lEndOfLoop
        sChar = Mid(originalString, lCtr, lCharLen)
        If StrComp(sChar, findString, vbTextCompare) = 0 Then lAns = lAns + 1
    Next
    return lAns
End Function

Usage

'If the count of all quotes does not equal the count of ", plus the count of ,", then there is an issue.
If CountChar(thisLine, Chr(34)) <> (CountChar(thisLine, Chr(34) & ",") + CountChar(thisLine, "," & Chr(34))) Then
    'rogue quotes
End If

Answer 1: (score: 0)

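So, this is what I ended up doing. I read each line from the CSV file and count how many quotes it contains. If the count is zero, I parse on commas only. If the count is odd, I remove every quote from the line and send it to manual error checking.
If the count is even, I replace the strings ," and ", with ::, and then I parse on both commas and ::. This seems to work.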

Using filereader As New Microsoft.VisualBasic.FileIO.TextFieldParser(files_(i), System.Text.Encoding.Default) 'system text decoding adds odd characters
    filereader.TextFieldType = FieldType.Delimited
    'filereader.Delimiters = New String() {","}
    filereader.SetDelimiters(",") 'tried new from Don's program 6/12
    filereader.HasFieldsEnclosedInQuotes = True

    While Not filereader.EndOfData
        Try
            'split_string = filereader1.ReadFields()
            whole_string = filereader.ReadLine()
        Catch ex As Microsoft.VisualBasic.FileIO.MalformedLineException
            MessageBox.Show(ex.Message & " : " & FileName & " ; " & filereader.ErrorLine)
            error_throw = filereader.ErrorLine
            error_throw = error_throw.Replace("""", " ")
            split_string = Split(error_throw, ",")
            'MsgBox("In catch routine, split string (0) " & split_string(0))
        End Try

        'count the quotes in the line
        Dim cnt As Integer = 0
        For Each c As Char In whole_string
            If c = """"c Then cnt = cnt + 1
        Next
        'MsgBox("cnt = " & cnt)

        If cnt = 0 Then 'no quotes
            split_string = Split(whole_string, ",") 'split by commas
        ElseIf cnt Mod 2 = 0 Then 'even number of quotes
            Dim rgex1 As New Regex("\.?(,"")+") 'comma-quote
            Dim rgex2 As New Regex("\.?("",)+") 'quote-comma

            'replace ," and ", with ::
            Dim whole_string1 As String = rgex1.Replace(whole_string, "::")
            Dim whole_string2 As String = rgex2.Replace(whole_string1, "::")

            'MsgBox(whole_string & " >> " & whole_string1 & " >> " & whole_string2)
            'split_string = Split(whole_string2, ",") 'non-regex code that allows program to run

            split_string = Regex.Split(whole_string2, ",|([<::>+].*[<::>+])")
            '(",(?=(?:[^\""]*\""[^\""]*\"")*(?![^\""]*\""))")

            'strip any :: markers left in the fields
            For p As Integer = 0 To split_string.GetUpperBound(0)
                split_string(p) = split_string(p).Replace("::", "")
            Next
        Else 'odd number of quotes
            'MsgBox("Odd quotes")
            whole_string = whole_string.Replace("""", " ") 'delete all quotes
            split_string = Split(whole_string, ",") 'split by commas
        End If
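
As a possible alternative to splitting by hand, a line that has been read with ReadLine (and corrected if necessary) can still be handed back to TextFieldParser through a StringReader, which is closer to the idea in the question. The sketch below is not the code used in this answer: the module name SingleLineParser and the function TryParseCsvLine are hypothetical, and it assumes every record fits on a single line.

```vbnet
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

' Hypothetical helper, not from the original post.
Module SingleLineParser
    ' Returns the parsed fields, or Nothing if the parser raises
    ' MalformedLineException for this line.
    Public Function TryParseCsvLine(line As String) As String()
        Using reader As New StringReader(line)
            Using parser As New TextFieldParser(reader)
                parser.TextFieldType = FieldType.Delimited
                parser.SetDelimiters(",")
                parser.HasFieldsEnclosedInQuotes = True
                Try
                    Return parser.ReadFields()
                Catch ex As MalformedLineException
                    ' Let the caller decide how to handle the bad line.
                    Return Nothing
                End Try
            End Using
        End Using
    End Function
End Module
```

A caller could treat a Nothing result as the signal to show the line in an input box and retry with the corrected text, so that quoted fields containing commas keep being handled by the parser rather than by Split.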