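Background: There are nearly 7,000 individuals, each with data from one, two, or three tests.
Everyone took the first test (call it test M). Some of those who took test M also took test I, and some of those who took test I also took test B.
For the first two tests (M and I), students can receive grade I, II, or III. Points are awarded by grade: 3 for grade I, 2 for grade II, and 1 for grade III.
The last test, B, is simply pass or fail with no grade. Those who pass get 1 point; those who fail get none. (Strictly, grades are awarded, but every grade earns the same 1 point.)
An amateur entered the data into an Excel file to represent these students and their grades. The problem is that this person did the worst possible thing: he invented his own notation and typed all of the test information into a single cell, and made my life hell.
The file originally had two text columns, one for the individual's ID and the other for the test information, if you can call it that.
[Image: http://i48.tinypic.com/5tv0bl.png]
I know it is horrible, and I am in pain. In the image, "M-II-2 I-III-1" means the person got grade II in test M for 2 points and grade III in test I for 1 point. Some people took only one test, some two, and some all three.
When the file came to me for processing and analysing the students' performance, I sent it back with instructions to insert three extra columns holding only the grades for the three tests. The file now looks like the following. Columns C and D hold grades I, II, and III as 1, 2, and 3 respectively: column C is for test M and column D is for test I. Column E says BA (B Achieved!) if the individual passed test B.
[Image: http://i50.tinypic.com/16c0yvr.png]
Now that you have the above information, let's get to the problem. I don't trust this data entry and want to check whether the data in column B matches the data in columns C, D, and E.
That is, I want to examine the string in column B and find out whether the values in columns C, D, and E are correct.
All help is greatly appreciated.
P.S. I have exported the data to MySQL via ODBC, which is why you see those NULLs. I tried doing this in MySQL as well, and would happily accept either a MySQL or an Excel solution; I have no preference.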
Answer 0 (score: 0)
Create a flat file from the raw data:
Sub GetData()
    Dim cn As Object
    Dim rs As Object
    Dim strFile As String
    Dim strCon As String
    Dim strSQL As String
    Dim s As String, t As Variant, x As Variant
    Dim i As Long, j As Long, k As Long

    ''This is not the best way to refer to the workbook
    ''you want, but it is very convenient for notes.
    ''It is probably best to use the name of the workbook.
    strFile = ActiveWorkbook.FullName

    ''Note that if HDR=No, F1,F2 etc are used for column names,
    ''if HDR=Yes, the names in the first row of the range
    ''can be used.
    ''This is the Jet 4 connection string, you can get more
    ''here : http://www.connectionstrings.com/excel
    ''Jet 4 reads .xls workbooks only; for .xlsx the usual replacement is
    ''Provider=Microsoft.ACE.OLEDB.12.0 with "Excel 12.0 Xml".
    strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strFile _
        & ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

    ''Late binding, so no reference is needed
    Set cn = CreateObject("ADODB.Connection")
    Set rs = CreateObject("ADODB.Recordset")

    cn.Open strCon

    strSQL = "SELECT * " _
        & "FROM [Sheet1$] "

    ''Open the recordset for more processing
    ''Cursor Type: 3, adOpenStatic
    ''Lock Type: 3, adLockOptimistic
    ''Not everything can be done with every cursor type and
    ''lock type. See http://www.w3schools.com/ado/met_rs_open.asp
    rs.Open strSQL, cn, 3, 3

    ''Pick a suitable empty worksheet for the results
    With Worksheets("Sheet2")
        ''Fill headers into the first row of the worksheet
        .Cells(1, 1) = "ID"
        .Cells(1, 2) = "Exam"
        .Cells(1, 3) = "Grade"
        .Cells(1, 4) = "Points"

        ''Working with the recordset ...
        ''Row counter for the output worksheet.
        ''Row one is used for titles, so ...
        i = 1

        Do While Not rs.EOF
            ''Store the ID to a string (if it is a long,
            ''change the type). & "" turns a Null into "".
            s = rs!ID & ""
            t = Split(rs!testinfo & "", " ")

            For j = 0 To UBound(t)
                ''Skip empty tokens left by double spaces
                If t(j) <> "" Then
                    ''One output row per test taken
                    i = i + 1
                    .Cells(i, 1) = s

                    If t(j) = "BA-1" Then
                        ''Test B carries no grade in the string
                        .Cells(i, 2) = "B"
                        .Cells(i, 3) = "A"
                        .Cells(i, 4) = 1
                    Else
                        ''"M-II-2" becomes Exam, Grade, Points
                        x = Split(t(j), "-")
                        For k = 0 To UBound(x)
                            .Cells(i, k + 2) = x(k)
                        Next
                    End If
                End If
            Next

            ''Keep going
            rs.MoveNext
        Loop
    ''Finished with the sheet
    End With

    ''Tidy up
    rs.Close
    Set rs = Nothing
    cn.Close
    Set cn = Nothing
End Sub
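Once the flat sheet exists, the Points column can be checked against the Grade column using the scoring rule from the question (grade I = 3, II = 2, III = 1, and 1 point for a passed test B). The following is only a rough sketch: `ExpectedPoints` and `CheckPoints` are hypothetical names that are not part of the answer above, and the sketch assumes the Sheet2 layout that GetData writes (ID, Exam, Grade, Points in columns 1 to 4).

```vba
''Hypothetical helper, not part of the original answer.
''Returns the points implied by a grade, or -1 if the grade is not recognised.
Function ExpectedPoints(ByVal grade As String) As Long
    Select Case UCase$(Trim$(grade))
        Case "I": ExpectedPoints = 3
        Case "II": ExpectedPoints = 2
        Case "III": ExpectedPoints = 1
        Case "A": ExpectedPoints = 1    ''test B rows as written by GetData
        Case Else: ExpectedPoints = -1
    End Select
End Function

''Prints every Sheet2 row whose Points value disagrees with its Grade.
''Assumes the layout produced by GetData: ID, Exam, Grade, Points.
Sub CheckPoints()
    Dim r As Long, lastRow As Long

    With Worksheets("Sheet2")
        lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row
        For r = 2 To lastRow
            If Val(.Cells(r, 4).Value & "") <> ExpectedPoints(.Cells(r, 3).Value & "") Then
                Debug.Print "Row " & r & ", ID " & .Cells(r, 1).Value
            End If
        Next r
    End With
End Sub
```

The mismatches go to the Immediate window (Ctrl+G in the VBA editor) rather than a message box, since the list may be long.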
Check the extra columns:
Sub CheckData()
    Dim cn As Object
    Dim rs As Object
    Dim strFile As String
    Dim strCon As String
    Dim strSQL As String
    Dim t As Variant, x As Variant
    Dim j As Long
    Dim BAErr As String, MErr As String, IErr As String

    strFile = ActiveWorkbook.FullName

    strCon = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & strFile _
        & ";Extended Properties=""Excel 8.0;HDR=Yes;IMEX=1"";"

    Set cn = CreateObject("ADODB.Connection")
    Set rs = CreateObject("ADODB.Recordset")

    cn.Open strCon

    strSQL = "SELECT * " _
        & "FROM [Sheet1$] "

    rs.Open strSQL, cn, 3, 3

    Do While Not rs.EOF
        ''& "" turns a Null cell into "", so Split and the
        ''comparisons below do not fail or silently pass on Null
        t = Split(rs!testinfo & "", " ")

        For j = 0 To UBound(t)
            x = Split(t(j), "-")

            ''Ignore empty or malformed tokens (no "-")
            If UBound(x) >= 1 Then
                Select Case x(0)
                    Case "BA"
                        If rs![test b] & "" <> "BA" Then
                            BAErr = BAErr & "," & rs!ID
                        End If
                    Case "M"
                        ''The column holds 1, 2 or 3; rebuild I, II or III to compare
                        If String(Val(rs![test m] & ""), "I") <> x(1) Then
                            MErr = MErr & "," & rs!ID
                        End If
                    Case "I"
                        If String(Val(rs![test i] & ""), "I") <> x(1) Then
                            IErr = IErr & "," & rs!ID
                        End If
                End Select
            End If
        Next

        rs.MoveNext
    Loop

    ''Tidy up
    rs.Close
    Set rs = Nothing
    cn.Close
    Set cn = Nothing

    ''MsgBox cuts off long prompts (roughly 1024 characters), so a long
    ''list of IDs is better written to a sheet or printed with Debug.Print
    If BAErr <> "" Then
        MsgBox Mid(BAErr, 2), , "B Errors"
    End If
    If MErr <> "" Then
        MsgBox Mid(MErr, 2), , "M Errors"
    End If
    If IErr <> "" Then
        MsgBox Mid(IErr, 2), , "I Errors"
    End If
End Sub
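CheckData only looks at tests that appear in the string, so a value typed into column C, D, or E for a test the person never took would go unnoticed. One possible way to cover both directions is a worksheet function that returns what each extra column should contain. This is a sketch under assumptions, not part of the original answer: `ExpectedValue` is a hypothetical name, and it assumes every grade is written only with the letter I, so that the length of the grade equals the number stored in the column.

```vba
''Hypothetical UDF, not part of the original answer.
''Returns what the extra column should hold for one test, based on testinfo:
''  test "M" or "I" -> 1, 2 or 3 (length of the Roman grade), "" if the test is absent
''  test "BA"       -> "BA" if present, "" otherwise
Function ExpectedValue(ByVal testinfo As String, ByVal test As String) As Variant
    Dim t As Variant, x As Variant, j As Long

    ExpectedValue = ""
    t = Split(Trim$(testinfo), " ")

    For j = 0 To UBound(t)
        x = Split(t(j), "-")
        ''Skip empty or malformed tokens (no "-")
        If UBound(x) >= 1 Then
            If x(0) = test Then
                If test = "BA" Then
                    ExpectedValue = "BA"
                Else
                    ''Assumes the grade is I, II or III
                    ExpectedValue = Len(x(1))
                End If
                Exit Function
            End If
        End If
    Next j
End Function
```

Placed in a standard module, it can be called from spare columns, for example `=ExpectedValue($B2,"M")`, `=ExpectedValue($B2,"I")`, and `=ExpectedValue($B2,"BA")`, and the results compared with columns C, D, and E. If the existing cells hold numbers stored as text, or the literal text NULL after the MySQL round trip, a direct `=` comparison will report a mismatch even where the data agrees, so those cells may need converting first.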