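I'm using Access 2007 to build a normalized database to replace one that uses a few flat, many-field tables. My problem is that I frequently receive Excel worksheets containing large numbers of updates, which I import as tables and then join against the existing tables to apply the updates. However, this is getting harder now that I'm normalizing. Here's a sample of the VBA code I use to update a value: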
Function updateBoxCategory(boxID As String, newCategory As String) As Long
    Dim boxKey As Long
    Dim catKey As Long
    Dim db As DAO.Database
    Dim ustr As String
    Set db = CurrentDb
    boxKey = getKey(db, "boxes", "boxID", boxID)
    ' exit if box not found (the return value stays 0)
    If boxKey = 0 Then
        Exit Function
    End If
    catKey = getKey(db, "categories", "category", newCategory)
    ' exit if category not found
    If catKey = 0 Then
        Exit Function
    End If
    ustr = "update boxes set catKey=" & catKey & " where ID=" & boxKey
    db.Execute ustr, dbFailOnError
    ' return the number of rows changed by the update
    updateBoxCategory = db.RecordsAffected
End Function
getKey(dbObject, "table", "field", "value") returns the primary key for a unique value.
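The body of getKey is not shown. A minimal sketch of what such a helper could look like follows, assuming the primary key column is named ID (as the UPDATE statement above suggests) and that the lookup field is a text field. The parameter names are hypothetical. The point is that every call runs one SELECT:

```vba
' Hypothetical sketch of a key lookup helper; not the original implementation.
Public Function getKey(db As DAO.Database, tableName As String, _
        fieldName As String, lookupValue As String) As Long
    Dim rs As DAO.Recordset
    Dim sql As String
    ' Assumption: the primary key column is named ID and the lookup field is text.
    ' Single quotes in the value are doubled so the SQL literal stays valid.
    sql = "SELECT ID FROM [" & tableName & "] WHERE [" & fieldName & "] = '" _
        & Replace(lookupValue, "'", "''") & "'"
    Set rs = db.OpenRecordset(sql, dbOpenSnapshot)
    ' The return value stays 0 when nothing matches.
    If Not rs.EOF Then getKey = rs!ID
    rs.Close
    Set rs = Nothing
End Function
```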
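My concern is that if I have to update, say, 100,000 records, I'll have to loop this procedure over every record, which means running 100,000 select queries against a table with 100,000 records. That makes me worry about performance, even with everything indexed.
Basically, my question is: is this code an appropriate way to handle updates to normalized data?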
Answer 0 (score: 1)
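In SQL we shy away from procedural code like this in favor of set-based solutions. The ideal is that you tell the optimizer, in a single SQL statement, what you want to achieve, and it (rather than you) decides how best to do it.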
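I'll assume you have a staging table (it could live in Excel, it could be a linked table) with columns boxID and newCategory holding the real keys. However, you can't use those values directly in the Boxes table because of an indirection in your schema design: you need to find the "surrogate" key values via lookup tables. (I suggest you consider fixing this "feature" in your design so that you can use the real key values :)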
Here's how to do it in Standard SQL:2003 (which works in SQL Server 2008, for example):
MERGE INTO Boxes
USING (
SELECT B1.ID AS boxKey, C1.ID AS catKey
FROM YourStagingTable AS S1
INNER JOIN Boxes AS B1
ON B1.boxID = S1.boxID
INNER JOIN Categories AS C1
ON C1.category = S1.NewCategory
) AS source (
boxKey, catKey
)
ON Boxes.ID = source.boxKey
WHEN MATCHED THEN
UPDATE
SET catKey = source.catKey;
And here's the equivalent in Standard SQL-92, which requires a scalar subquery:
UPDATE Boxes
SET catKey =
(
SELECT C1.ID AS catKey
FROM YourStagingTable AS S1
INNER JOIN Boxes AS B1
ON B1.boxID = S1.boxID
INNER JOIN Categories AS C1
ON C1.category = S1.NewCategory
WHERE Boxes.ID = B1.ID
)
WHERE EXISTS (
SELECT *
FROM YourStagingTable AS S1
INNER JOIN Boxes AS B1
ON B1.boxID = S1.boxID
WHERE Boxes.ID = B1.ID
);
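Sadly, Access (Jet, ACE, whatever) doesn't support any modern SQL Standard even at entry level (if something from 1992 can truly be considered 'modern' :) Instead, Access insists on its own proprietary UPDATE syntax, which I've never really become familiar with. Hopefully the above will point you in the right direction for Access (or perhaps someone can edit this answer to add the equivalent in the Access dialect...?)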
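For reference, here is a rough sketch of what the join-based UPDATE might look like in the Access (Jet/ACE) dialect, wrapped in VBA so it can be run through DAO. It reuses the table and column names from the statements above (YourStagingTable is still a placeholder), and the function name UpdateBoxCategoriesFromStaging is made up for illustration. Access requires the parentheses when more than two tables are joined:

```vba
' Sketch only: applies every staged category change in one set-based statement.
Public Function UpdateBoxCategoriesFromStaging() As Long
    Dim db As DAO.Database
    Dim strSQL As String
    Set db = CurrentDb
    ' Inner joins mean rows with an unknown boxID or category are simply skipped,
    ' which mirrors the "exit if not found" checks in the row-by-row version.
    strSQL = "UPDATE (Boxes INNER JOIN YourStagingTable AS S1 " & _
             "ON Boxes.boxID = S1.boxID) " & _
             "INNER JOIN Categories AS C1 ON C1.category = S1.NewCategory " & _
             "SET Boxes.catKey = C1.ID"
    db.Execute strSQL, dbFailOnError
    ' Number of Boxes rows that were updated
    UpdateBoxCategoriesFromStaging = db.RecordsAffected
End Function
```

If Access reports that the operation must use an updateable query, which can happen when the staging table is a linked spreadsheet, importing the sheet into a local table first is a common workaround.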
Answer 1 (score: 0)
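I'm not entirely sure what you're doing here, but if you're trying to match a row in a table to a row in a spreadsheet and copy the values that differ from the spreadsheet into the table, you need to rotate your approach 90 degrees.
That is, instead of running a SQL UPDATE for each ROW, you run one for each COLUMN.
Here's code that does this. It assumes that the two tables being compared share a primary key and have the same field names (though you could write a query that aliases the fields in one to match the names in the other):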
Public Function UpdateTableData(ByVal strSourceTable As String, _
    ByVal strTargetTable As String, ByVal strJoinField As String, _
    ByRef db As DAO.Database, Optional ByVal strExcludeFieldsList As String, _
    Optional ByVal strUpdatedBy As String = "Auto Update", _
    Optional strAdditionalCriteria As String) As Boolean
  ' SQLRun and varZLSToNull are helper functions that are not shown here;
  ' SQLRun is assumed to execute the SQL and return the records affected
  Dim strUpdate As String
  Dim rsFields As DAO.Recordset
  Dim fld As DAO.Field
  Dim strFieldName As String
  Dim strNZValue As String
  Dim strSet As String
  Dim strWhere As String
  Const STR_QUOTE As String = """"

  strUpdate = "UPDATE " & strTargetTable & " INNER JOIN " & strSourceTable _
     & " ON " & strTargetTable & "." & strJoinField & " = " _
     & strSourceTable & "." & strJoinField
  ' if the fields don't have the same names in both tables,
  '   create a query that aliases the fields to have the names of the
  '   target table
  ' if the source table is in a different database and you don't
  '   want to create a linked table, create a query and specify
  '   the external database as the source of the table
  ' alternatively, for strTargetTable, supply a SQL string with
  '   the external connect string
  Set rsFields = db.OpenRecordset(strSourceTable)
  For Each fld In rsFields.Fields
    strFieldName = fld.Name
    ' skip the join field and any field named in the exclusion list
    If strFieldName <> strJoinField And (InStr(", " & strExcludeFieldsList _
       & ",", strFieldName & ",") = 0) Then
      ' pick the Nz() default that matches the field's data type
      Select Case fld.Type
        Case dbText, dbMemo
          strNZValue = "''"
        Case Else
          strNZValue = "0"
      End Select
      strSet = " SET " & strTargetTable & "." & strFieldName _
         & " = varZLSToNull(" & strSourceTable & "." & strFieldName & ")"
      strSet = strSet & ", " & strTargetTable & ".Updated = #" & Date & "#"
      strSet = strSet & ", " & strTargetTable & ".UpdatedBy = " _
         & STR_QUOTE & strUpdatedBy & STR_QUOTE
      ' only touch rows where this column actually differs
      strWhere = " WHERE Nz(" & strTargetTable & "." & strFieldName _
         & ", " & strNZValue & ") <> Nz(" & strSourceTable & "." _
         & strFieldName & ", " & strNZValue & ")"
      If db.TableDefs(strTargetTable).Fields(fld.Name).Required Then
        strWhere = strWhere & " AND " & strSourceTable & "." _
           & strFieldName & " Is Not Null"
      End If
      If Len(strAdditionalCriteria) > 0 Then
        strWhere = strWhere & " AND " & strAdditionalCriteria
      End If
      Debug.Print strUpdate & strSet & strWhere
      Debug.Print SQLRun(strUpdate & strSet & strWhere, db) & " " _
         & strFieldName & " updated."
    End If
  Next fld
  Debug.Print db.OpenRecordset("SELECT COUNT(*) FROM " _
     & strTargetTable & " WHERE Updated=#" & Date _
     & "# AND UpdatedBy=" & STR_QUOTE & strUpdatedBy & STR_QUOTE)(0) _
     & " total records updated in " & strTargetTable
  rsFields.Close
  Set rsFields = Nothing
  UpdateTableData = True
End Function
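I've been using variants of that code for over a decade, and it's much faster and more efficient than doing it row by row.
Note that some assumptions are hard-wired (such as the fact that every table has Updated and UpdatedBy fields). But it should get you started.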
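A call might look something like the sketch below. The table names tblBoxesImport and tblBoxes and the procedure name RunBoxImport are placeholders, not names from the question. The exclusion list is written as a comma-plus-space separated string because that is the format the InStr test inside UpdateTableData expects:

```vba
' Hypothetical usage: sync tblBoxes from an imported copy of the spreadsheet.
Public Sub RunBoxImport()
    Dim db As DAO.Database
    Set db = CurrentDb
    ' Arguments: source table, target table, shared key field, database,
    ' then the fields to leave alone (here the key and the audit columns).
    If UpdateTableData("tblBoxesImport", "tblBoxes", "boxID", db, _
            "ID, Updated, UpdatedBy") Then
        Debug.Print "Import applied."
    End If
    Set db = Nothing
End Sub
```

However many rows the import contains, this runs one UPDATE per non-excluded column, so the number of statements depends on the width of the table and not on its length.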