Excel string manipulation to check data consistency

Time: 2010-06-01 09:55:58

Tags: mysql excel string

Background: there are nearly 7,000 individuals, with data on their performance in one, two, or three tests.

Every individual has taken the first test (call it Test M). Some of those who took Test M also took Test I, and some of those who took Test I also took Test B.

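For the first two tests (M and I), students are graded I, II, or III. Points are awarded according to the grade: 3 for grade I, 2 for grade II, and 1 for grade III.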

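The last test, Test B, is just a pass or fail result with no grades. Those who pass get 1 point; those who fail get none. (Grades are in fact awarded, but every grade earns the same 1 point.)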
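To make the scoring rule for tests M and I concrete, here is a minimal VBA sketch of the grade-to-points mapping. The function name PointsForGrade is made up for illustration and does not exist in the workbook; returning 0 for an unrecognised grade is also an assumption.

```vba
''Hypothetical helper: points awarded for a grade in Test M or Test I.
Function PointsForGrade(ByVal grade As String) As Integer
    Select Case UCase$(Trim$(grade))
        Case "I": PointsForGrade = 3
        Case "II": PointsForGrade = 2
        Case "III": PointsForGrade = 1
        Case Else: PointsForGrade = 0   ''assumption: unknown grade scores nothing
    End Select
End Function
```

For example, PointsForGrade("II") returns 2.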

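An amateur entered the data for these students and their grades in an Excel file. The problem is that this person did the worst thing possible: he invented his own notation and put all the test information in a single cell, and made my life hell.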

The file originally had two text columns, one for the individual's ID and the other for the test info, if one could call it that.

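Image: http://i48.tinypic.com/5tv0bl.png

It's horrible, I know, and I am suffering. In the image, "M-II-2 I-III-1" means the person got grade II in Test M for 2 points and grade III in Test I for 1 point. Some have taken only one test, some two, and some three.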
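The notation can be taken apart with two calls to Split: first on spaces to separate the tests, then on hyphens to separate test, grade, and points. The following is only a sketch; DescribeTestInfo is a made-up name, and it assumes every M or I entry has exactly three hyphen-separated parts. Anything else is printed unparsed.

```vba
''Illustrative only: prints each test found in one cell of column B.
Sub DescribeTestInfo(ByVal info As String)
    Dim tokens As Variant, parts As Variant
    Dim j As Long

    ''Tests are separated by spaces, e.g. "M-II-2 I-III-1"
    tokens = Split(Trim$(info), " ")

    For j = 0 To UBound(tokens)
        ''Each test is written test-grade-points, e.g. "M-II-2"
        parts = Split(tokens(j), "-")
        If UBound(parts) = 2 Then
            Debug.Print "Test " & parts(0) & ": grade " & parts(1) & _
                        ", " & parts(2) & " point(s)"
        Else
            ''Assumption: entries of any other shape are shown as they are
            Debug.Print "Other entry: " & tokens(j)
        End If
    Next j
End Sub
```

Calling DescribeTestInfo "M-II-2 I-III-1" from the Immediate window prints one line for Test M (grade II, 2 points) and one for Test I (grade III, 1 point).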

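When the file came to me for processing and analysing the students' performance, I sent it back with instructions to insert three additional columns holding only the grades for the three tests. The file now looks as follows. Columns C and D represent grades I, II, and III as 1, 2, and 3 respectively: column C is for Test M and column D for Test I. Column E says BA (B Achieved!) if the individual has passed Test B.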

Image: http://i50.tinypic.com/16c0yvr.png
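Because columns C and D store the grade as a number while column B stores it as a Roman numeral, any comparison has to convert one form into the other. A small sketch of that conversion, assuming only the values 1 to 3 are valid (GradeNumeral is a hypothetical name):

```vba
''Hypothetical helper: turns 1, 2, 3 into "I", "II", "III".
Function GradeNumeral(ByVal n As Variant) As String
    ''Null and non-numeric text return an empty string
    If Not IsNumeric(n) Then Exit Function

    Select Case CLng(n)
        Case 1 To 3
            ''String(count, character) repeats the character count times
            GradeNumeral = String(CLng(n), "I")
    End Select
End Function
```

GradeNumeral(2) returns "II", which can then be compared directly with the grade part of an entry such as "M-II-2".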

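Now that you have the above information, let's get to the problem. I don't trust this and want to check whether the data in column B matches the data in columns C, D, and E.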

That is, I want to examine the string in column B and find out whether the figures in columns C, D, and E are correct.

All help is really appreciated.

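P.S. I exported this to MySQL via ODBC, which is why you see those NULLs. I tried doing this in MySQL too, and will happily accept either a MySQL or an Excel solution; I have no preference.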

Edit: see file with sample data

1 answer:

Answer 0 (score: 0)

Create a flat file from the raw data:

Sub GetData()
    Dim cn As Object
    Dim rs As Object
    Dim strFile As String
    Dim strCon As String
    Dim strSQL As String
    Dim s As String, t As Variant, x As Variant
    Dim i As Long, j As Integer, k As Integer

    ''This is not the best way to refer to the workbook
    ''you want, but it is very convenient for notes
    ''It is probably best to use the name of the workbook.

    strFile = ActiveWorkbook.FullName

    ''Note that if HDR=No, F1,F2 etc are used for column names,
    ''if HDR=Yes, the names in the first row of the range
    ''can be used.
    ''This is the Jet 4 connection string, you can get more
    ''here : http://www.connectionstrings.com/excel

    strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strFile _
        & ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

    ''Late binding, so no reference is needed

    Set cn = CreateObject("ADODB.Connection")
    Set rs = CreateObject("ADODB.Recordset")


    cn.Open strCon

    strSQL = "SELECT * " _
           & "FROM [Sheet1$] "

    ''Open the recordset for more processing
    ''Cursor Type: 3, adOpenStatic
    ''Lock Type: 3, adLockOptimistic
    ''Not everything can be done with every cursor type and
    ''lock type. See http://www.w3schools.com/ado/met_rs_open.asp

    rs.Open strSQL, cn, 3, 3

    ''Pick a suitable empty worksheet for the results

    With Worksheets("Sheet2")

        ''Fill headers into the first row of the worksheet


        .Cells(1, 1) = "ID"
        .Cells(1, 2) = "Exam"
        .Cells(1, 3) = "Grade"
        .Cells(1, 4) = "Points"

        ''Working with the recordset ...

        ''Row counter for the worksheet.
        ''Row one is used for the titles, so ...
        i = 1

        Do While Not rs.EOF


            ''Store the ID to a string (if it is a long,
            ''change the type) ...

            s = rs!ID

            ''Appending "" turns a Null (blank cell) into an empty
            ''string, which Split accepts
            t = Split(Trim(rs!testinfo & ""), " ")

            For j = 0 To UBound(t)
                ''(Counter)
                i = i + 1

                .Cells(i, 1) = s

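                x = Split(t(j), "-")

                If t(j) = "BA-1" Then
                    ''Test B has no grade of its own: store it as B / A / 1
                    .Cells(i, 2) = "B"
                    .Cells(i, 3) = "A"
                    .Cells(i, 4) = 1
                Else
                    ''x holds exam, grade, points: columns 2 to 4
                    For k = 0 To UBound(x)
                        .Cells(i, k + 2) = x(k)
                    Next
                End If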
            Next


            ''Keep going 
            rs.MoveNext

        Loop

    ''Finished with the sheet
    End With

    ''Tidy up
    rs.Close
    Set rs = Nothing
    cn.Close
    Set cn = Nothing
End Sub

Check the extra columns:

Sub CheckData()
    Dim cn As Object
    Dim rs As Object
    Dim strFile As String
    Dim strCon As String
    Dim strSQL As String
    Dim t As Variant, x As Variant
    Dim j As Integer
    Dim BAErr As String, MErr As String, IErr As String

    strFile = ActiveWorkbook.FullName

    strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strFile _
        & ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

    Set cn = CreateObject("ADODB.Connection")
    Set rs = CreateObject("ADODB.Recordset")

    cn.Open strCon

    strSQL = "SELECT * " _
           & "FROM [Sheet1$] "

    rs.Open strSQL, cn, 3, 3

    Do While Not rs.EOF

        ''Appending "" turns a Null (blank cell) into an empty string
        t = Split(Trim(rs!testinfo & ""), " ")

        For j = 0 To UBound(t)
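            x = Split(t(j), "-")

            ''A blank cell comes back as Null. Appending "" makes it an
            ''empty string, so a missing grade is reported as an error
            ''instead of being skipped or stopping the macro.
            Select Case x(0)
                Case "BA"
                    If (rs![test b] & "") <> "BA" Then
                        BAErr = BAErr & "," & rs!ID
                    End If
                Case "M"
                    If String(Val(rs![test m] & ""), "I") <> x(1) Then
                        MErr = MErr & "," & rs!ID
                    End If
                Case "I"
                    If String(Val(rs![test i] & ""), "I") <> x(1) Then
                        IErr = IErr & "," & rs!ID
                    End If
            End Select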
        Next

        rs.MoveNext

    Loop


    ''Tidy up
    rs.Close
    Set rs = Nothing
    cn.Close
    Set cn = Nothing

    If BAErr <> "" Then
        MsgBox Mid(BAErr, 2), , "B Errors"
    End If

    If MErr <> "" Then
        MsgBox Mid(MErr, 2), , "M Errors"
    End If

    If IErr <> "" Then
        MsgBox Mid(IErr, 2), , "I Errors"
    End If

End Sub