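Updating large recordsets to a normalized Access database

Time: 2011-04-20 22:07:21

Tags: sql ms-access vba ms-access-2007 normalization

I am using Access 2007 to build a normalized database to replace one that used a few flat, multi-field tables. My problem is that I often receive Excel worksheets containing a large number of updates, which I import as a table and then join against the existing tables to apply the updates. Now that I am normalizing, however, this becomes harder. Here is a sample of the VBA code that updates a value: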

Function updateBoxCategory(boxID As String, newCategory As String) As Long

    Dim boxKey As Long
    Dim catKey As Long
    Dim db As DAO.Database
    Dim ustr As String

    Set db = CurrentDb

    ' Look up the surrogate key for the box
    boxKey = getKey(db, "boxes", "boxID", boxID)

    ' Exit (returning 0) if the box is not found
    If boxKey = 0 Then
        Exit Function
    End If

    ' Look up the surrogate key for the category
    catKey = getKey(db, "categories", "category", newCategory)

    ' Exit (returning 0) if the category is not found
    If catKey = 0 Then
        Exit Function
    End If

    ustr = "UPDATE boxes SET catKey=" & catKey & " WHERE ID=" & boxKey
    db.Execute ustr, dbFailOnError

    ' Return the number of rows updated
    updateBoxCategory = db.RecordsAffected

End Function

getKey(dbObject, "table", "field", "value") returns the primary key of the row holding that unique value.
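The question does not show getKey itself. As a purely hypothetical sketch of what such a lookup might look like (assuming the primary key column is named ID, as the UPDATE statement above suggests, and that the lookup field is text), each call costs one SELECT against the table:

```vba
' Hypothetical sketch only; the asker's real getKey is not shown.
' Assumes the primary key column is named ID and the lookup field is text.
Public Function getKey(db As DAO.Database, tableName As String, _
                       fieldName As String, lookupValue As String) As Long
    Dim rs As DAO.Recordset
    Dim strSQL As String

    ' Double any embedded single quotes so the literal stays valid
    strSQL = "SELECT ID FROM [" & tableName & "] WHERE [" & fieldName _
        & "] = '" & Replace(lookupValue, "'", "''") & "'"

    Set rs = db.OpenRecordset(strSQL, dbOpenSnapshot)
    If Not rs.EOF Then
        getKey = rs!ID      ' stays 0 when no row matches
    End If
    rs.Close
    Set rs = Nothing
End Function
```

With two lookups plus one UPDATE per spreadsheet row, the round trips add up quickly, which is the concern raised next.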

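My concern is that if I have to update, say, 100,000 records, I will have to loop this procedure's queries over every record, which means running 100,000 select queries against a table with 100,000 records. That worries me from a performance standpoint, even if everything is indexed.

Basically, my question is: is this code an appropriate way to handle updating normalized data?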

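2 Answers:

Answer 0 (score: 1)

In SQL we shy away from procedural code like this in favor of set-based solutions. The ideal is that you tell the optimizer what you want to achieve in a single SQL statement, and it (rather than you) decides how best to do it.

I assume you have a staging table (it could be in Excel, it could be a linked table) containing columns for the real keys, boxID and newCategory. However, you cannot use these values directly in the Boxes table because there is some indirection in your schema design: you need to use lookup tables to find the "surrogate" key values. (I suggest you consider fixing this "feature" in your design so that you can use the real key values :)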

Here is how to do it using Standard SQL:2003 (this works on SQL Server 2008, for example):

MERGE INTO Boxes
   USING (
          SELECT B1.ID AS boxKey, C1.ID AS catKey
            FROM YourStagingTable AS S1
                 INNER JOIN Boxes AS B1
                    ON B1.boxID = S1.boxID
                 INNER JOIN Categories AS C1
                    ON C1.category = S1.NewCategory
         ) AS source (
                      boxKey, catKey
                     )
      ON Boxes.ID = source.boxKey
WHEN MATCHED THEN
   UPDATE
      SET catKey = source.catKey;

Here is the equivalent in Standard SQL-92, which requires the use of scalar subqueries:

UPDATE Boxes
   SET catKey = 
                (
                 SELECT C1.ID AS catKey
                   FROM YourStagingTable AS S1
                         INNER JOIN Boxes AS B1
                            ON B1.boxID = S1.boxID
                         INNER JOIN Categories AS C1
                            ON C1.category = S1.NewCategory
                  WHERE Boxes.ID = B1.ID
                ) 
 WHERE EXISTS (
               SELECT * 
                 FROM YourStagingTable AS S1
                       INNER JOIN Boxes AS B1
                          ON B1.boxID = S1.boxID
                WHERE Boxes.ID = B1.ID
              );

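Sadly, Access (Jet, ACE, whatever) does not support any modern SQL standard, even at entry level (if something from 1992 can really be considered "modern" :) Instead, Access insists on its own proprietary UPDATE join syntax, which I have never really become familiar with. Hopefully the above points you in the right direction for Access (or perhaps someone can edit this answer to add the equivalent in the Access dialect...?)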
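For what it is worth, a minimal sketch of how the same set-based update might look in the Access dialect, wrapped in VBA. It assumes the table and column names used in the statements above (Boxes.ID, Boxes.boxID, Boxes.catKey, Categories.ID, Categories.category, and a staging table YourStagingTable with boxID and NewCategory columns); the function name is made up, and it has not been tested against the asker's schema:

```vba
' Sketch under the assumptions stated above; names are illustrative.
Public Function UpdateBoxCategoriesFromStaging() As Long
    Dim db As DAO.Database
    Dim strSQL As String

    ' Access requires parentheses around the first join when chaining joins.
    ' One statement updates every box that has a matching staging row
    ' and a matching category.
    strSQL = "UPDATE (Boxes " & _
             "INNER JOIN YourStagingTable AS S ON Boxes.boxID = S.boxID) " & _
             "INNER JOIN Categories AS C ON C.category = S.NewCategory " & _
             "SET Boxes.catKey = C.ID"

    Set db = CurrentDb
    db.Execute strSQL, dbFailOnError

    ' Number of rows changed by the single UPDATE
    UpdateBoxCategoriesFromStaging = db.RecordsAffected
End Function
```

Rows whose box or category has no match are simply left alone by the inner joins, which mirrors the "exit if not found" checks in the question. If the staging table holds more than one row for the same boxID, which value ends up in the table is not something this sketch controls, so deduplicating the staging data first is probably wise.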

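Answer 1 (score: 0)

I'm not entirely sure what you are doing here, but if you are trying to match a row in a table with a row in a spreadsheet and copy the values that differ from the spreadsheet into the table, you need to turn your approach 90 degrees.

That is, instead of running a SQL UPDATE for each ROW, you run one for each COLUMN.

Here is code that does this. It assumes that the two tables being compared share a primary key and have the same field names (though you could write a query that aliases the fields in one to match the names in the other):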

  Public Function UpdateTableData(ByVal strSourceTable As String, _
       ByVal strTargetTable As String, ByVal strJoinField As String, _
       ByRef db As DAO.Database, Optional ByVal strExcludeFieldsList As String, _
       Optional ByVal strUpdatedBy As String = "Auto Update", _
       Optional strAdditionalCriteria As String) As Boolean
    Dim strUpdate As String
    Dim rsFields As DAO.Recordset
    Dim fld As DAO.Field
    Dim strFieldName As String
    Dim strNZValue As String
    Dim strSet As String
    Dim strWhere As String
    Const STR_QUOTE As String = """"

    strUpdate = "UPDATE " & strTargetTable & " INNER JOIN " & strSourceTable _
       & " ON " & strTargetTable & "." & strJoinField & " = " _
       & strSourceTable & "." & strJoinField
    ' if the fields don't have the same names in both tables,
    '   create a query that aliases the fields to have the names of the
    '   target table
    ' if the source table is in a different database and you don't
    '   want to create a linked table, create a query and specify
    '   the external database as the source of the table
    ' alternatively, for strTargetTable, supply a SQL string with
    '   the external connect string
    Set rsFields = db.OpenRecordset(strSourceTable)
    For Each fld In rsFields.Fields
      strFieldName = fld.Name
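      ' Skip the join field and any field named in the exclude list
      '   (the list is assumed to be separated by ", ")
      If strFieldName <> strJoinField And (InStr(", " & strExcludeFieldsList _
           & ",", ", " & strFieldName & ",") = 0) Then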
         Select Case fld.Type
           Case dbText, dbMemo
             strNZValue = "''"
           Case Else
             strNZValue = "0"
         End Select
         strSet = " SET " & strTargetTable & "." & strFieldName _
           & " = varZLSToNull(" & strSourceTable & "." & strFieldName & ")"
         ' ISO date format keeps the literal independent of regional settings
         strSet = strSet & ", " & strTargetTable & ".Updated = #" _
           & Format(Date, "yyyy\-mm\-dd") & "#"
         strSet = strSet & ", " & strTargetTable & ".UpdatedBy = " _
           & STR_QUOTE & strUpdatedBy & STR_QUOTE
         strWhere = " WHERE Nz(" & strTargetTable & "." & strFieldName _
           & ", " & strNZValue & ") <> Nz(" & strSourceTable & "." _
           & strFieldName & ", " & strNZValue & ")"
         If db.TableDefs(strTargetTable).Fields(fld.Name).Required Then
            strWhere = strWhere & " AND " & strSourceTable & "." _
              & strFieldName & " Is Not Null"
         End If
         If Len(strAdditionalCriteria) > 0 Then
            strWhere = strWhere & " AND " & strAdditionalCriteria
         End If
         Debug.Print strUpdate & strSet & strWhere
         Debug.Print SQLRun(strUpdate & strSet & strWhere, db) & " " _
           & strFieldName & " updated."
      End If
    Next fld
    Debug.Print db.OpenRecordset("SELECT COUNT(*) FROM " _
      & strTargetTable & " WHERE Updated=#" & Format(Date, "yyyy\-mm\-dd") _
      & "# AND UpdatedBy=" & STR_QUOTE & strUpdatedBy & STR_QUOTE)(0) _
      & " total records updated in " & strTargetTable
    rsFields.Close
    Set rsFields = Nothing
    UpdateTableData = True
  End Function

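I have been using variations of that code for over a decade, and it is much faster and more efficient than doing it row by row.

Note that some assumptions are hard-wired (such as the fact that every table has Updated and UpdatedBy fields). But it should get you started.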
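The code above also calls two helpers, varZLSToNull and SQLRun, that are not shown in the answer. The versions below are guesses based on their names and how they are called, not the answerer's originals; the table names in the example call (tblImport, tblBoxes) are likewise hypothetical:

```vba
' Hypothetical helper: turns a zero-length string (or Null) into Null,
' and passes any other value through unchanged.
Public Function varZLSToNull(ByVal varInput As Variant) As Variant
    If Len(varInput & vbNullString) = 0 Then
        varZLSToNull = Null
    Else
        varZLSToNull = varInput
    End If
End Function

' Hypothetical helper: runs an action query and returns the rows affected.
Public Function SQLRun(ByVal strSQL As String, _
                       ByRef db As DAO.Database) As Long
    db.Execute strSQL, dbFailOnError
    SQLRun = db.RecordsAffected
End Function

' Example call: copy changed values from the imported sheet into the
' target table, matching rows on a shared boxID field.
Public Sub DemoUpdateTableData()
    Dim db As DAO.Database
    Set db = CurrentDb
    UpdateTableData "tblImport", "tblBoxes", "boxID", db
End Sub
```

Because varZLSToNull is called from inside the generated SQL, it would need to live in a standard module as a Public function and the query would need to run from within Access itself, where VBA functions are visible to the query engine.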