How do I synchronize data from an external database in Rails?

Asked: 2015-02-27 20:22:07

Tags: ruby-on-rails sql-server rails-activerecord

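I am importing information from another database into my Rails database. At the moment I query the external database to fetch all records from a table, then create new objects of my model and assign the values. I want to detect whether a record has been updated and, if it has, update my copy. Each record has about 40 attributes. What approach can I use to query the database and find out whether a record has changed? I currently use the following query, but it seems slow.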

SELECT A.attribute1, A.attribute2, A.attribute3, ...
  FROM external.dbo.myobject A
 INNER JOIN internal.dbo.myobject B
    ON A.key = B.key
 WHERE (A.attribute1 <> B.attribute1 OR
        A.attribute2 <> B.attribute2 OR
        A.attribute3 <> B.attribute3 OR
        ...)

1 answer:

Answer 0: (score: 1)

1) To detect changes, the simplest solution is to use EXCEPT.

For example:

INSERT INTO #changes (...)
SELECT ... FROM external.dbo.object
EXCEPT
SELECT ... FROM internal.dbo.object

This statement inserts into #changes every row from external.dbo.object that differs from, or does not exist in, internal.dbo.object.
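The set semantics of EXCEPT can be illustrated outside the database. The following is a minimal Ruby sketch, not part of the original answer, that models rows as arrays (the sample data is made up). It also shows one difference from the `<>` comparison in the question: EXCEPT treats two NULLs as equal, whereas `A.col <> B.col` never evaluates to true when either side is NULL, so a change to or from NULL is missed.

```ruby
# Rows are modelled as [id, col_a] arrays; the data is invented for illustration.
external_rows = [[1, "A"], [2, "B#"], [3, "C"], [4, nil]]
internal_rows = [[1, "A"], [2, "B"], [4, nil]]

# Array#- keeps the rows of the left array that have no equal row in the right
# array; uniq mirrors the fact that EXCEPT returns distinct rows.
changes = (external_rows - internal_rows).uniq

# nil (standing in for NULL) compares equal to nil here, as in EXCEPT,
# so row 4 is not reported as changed.
p changes
```

Row 2 (modified) and row 3 (new) end up in `changes`; rows 1 and 4 are identical on both sides and are left out.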

For the synchronization itself, I would use the MERGE statement (see the examples below):

MERGE   dbo.InternalObj AS i
USING   #changes AS e ON i.ID = e.ID
WHEN MATCHED 
    THEN UPDATE SET ... 
WHEN NOT MATCHED
    THEN INSERT ...; -- This clause inserts new rows (MERGE must end with a semicolon)
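The two branches of the MERGE above amount to an "update if the key exists, otherwise insert" loop. Here is a rough Ruby sketch of that logic, assuming a hypothetical in-memory hash in place of the internal table and an array in place of #changes. The method name `merge_rows!` and the column `col_a` are illustrative only.

```ruby
# Hypothetical stand-ins: the internal table keyed by ID, and the #changes rows.
internal = { 1 => { col_a: "A" }, 2 => { col_a: "B" } }
changes  = [{ id: 2, col_a: "B#" }, { id: 3, col_a: "C" }]

# Applies each source row the way the two MERGE branches would and
# returns how many rows were updated and how many were inserted.
def merge_rows!(target, source_rows)
  counts = { updated: 0, inserted: 0 }
  source_rows.each do |row|
    attrs = row.reject { |key, _| key == :id }
    if target.key?(row[:id])
      target[row[:id]].merge!(attrs) # WHEN MATCHED THEN UPDATE
      counts[:updated] += 1
    else
      target[row[:id]] = attrs       # WHEN NOT MATCHED THEN INSERT
      counts[:inserted] += 1
    end
  end
  counts
end

counts = merge_rows!(internal, changes)
p counts
```

In a real application the database would do this work in a single statement; the sketch only makes the control flow explicit.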

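2) Another option for detecting changes is the ROWVERSION data type, a binary value that SQL Server generates automatically whenever a row is inserted or updated.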

Example:

CREATE TABLE dbo.InternalObj (
    ID      INT PRIMARY KEY,
    ColA    VARCHAR(100) NOT NULL,
    rw      BINARY(8) NOT NULL 
);
GO

CREATE TABLE dbo.ExternalObj (
    ID      INT IDENTITY(1,1) PRIMARY KEY,
    ColA    VARCHAR(100) NOT NULL,
    rw      ROWVERSION NOT NULL -- SQL Server automatically updates [rw] whenever the row is inserted or updated
);
GO
INSERT dbo.ExternalObj (ColA) VALUES ('A')
INSERT dbo.ExternalObj (ColA) VALUES ('B')
GO

-- First test & sync
MERGE   dbo.InternalObj AS i
USING   dbo.ExternalObj AS e ON i.ID = e.ID
WHEN MATCHED AND i.rw <> e.rw -- Same [ID] but different [rw] values
    THEN
    UPDATE -- This clause updates changed rows
    SET i.ColA  = e.ColA,
        i.rw    = e.rw 
WHEN NOT MATCHED
    THEN 
    INSERT  (ID, ColA, rw) -- This clause inserts new rows
    VALUES  (e.ID, e.ColA, e.rw);
GO
SELECT * FROM dbo.InternalObj;
/*
ID          ColA rw
----------- ---- ------------------
1           A    0x0000000000000FA6
2           B    0x0000000000000FA7
*/
GO

-- Second test & sync
INSERT dbo.ExternalObj (ColA) VALUES ('C')
UPDATE  dbo.ExternalObj
SET     ColA = ColA + '#'
WHERE   ID = 2
GO
MERGE   dbo.InternalObj AS i
USING   dbo.ExternalObj AS e ON i.ID = e.ID
WHEN MATCHED AND i.rw <> e.rw -- Same [ID] but different [rw] values
    THEN
    UPDATE -- This clause updates changed rows
    SET i.ColA  = e.ColA,
        i.rw    = e.rw 
WHEN NOT MATCHED 
    THEN 
    INSERT  (ID, ColA, rw) -- This clause inserts new rows
    VALUES  (e.ID, e.ColA, e.rw);
GO
SELECT * FROM dbo.InternalObj;
/*
ID          ColA rw
----------- ---- ------------------
1           A    0x0000000000000FA6
2           B#   0x0000000000000FA9
3           C    0x0000000000000FA8
*/
GO

If I needed to detect changes in any column, I would use [ROWVERSION].

Note: even a trivial statement such as UPDATE ... SET ColA = <the same value> changes the [rw] value of the affected rows.
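The behaviour in the note can be pictured with a toy model. The Ruby class below is a simplified sketch, not a description of how SQL Server implements ROWVERSION: it assumes a single counter that is bumped on every insert and every update, whether or not the stored value actually differs. The class name `VersionedTable` and its methods are hypothetical.

```ruby
# Toy model of a table with a rowversion-like column.
class VersionedTable
  attr_reader :rows

  def initialize
    @rows = {}
    @counter = 0 # stands in for the database-wide version counter
  end

  def insert(id, col_a)
    @rows[id] = { col_a: col_a, rw: next_version }
  end

  def update(id, col_a)
    row = @rows.fetch(id)
    row[:col_a] = col_a
    row[:rw] = next_version # bumped even if col_a is unchanged
  end

  private

  def next_version
    @counter += 1
  end
end

table = VersionedTable.new
table.insert(1, "A")
version_before = table.rows[1][:rw]
table.update(1, "A") # writes the same value again
version_after = table.rows[1][:rw]
puts "rw changed: #{version_before != version_after}"
```

The practical consequence is that a sync comparing [rw] values may copy rows whose data is actually identical; that costs some extra work but does not lose changes.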

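3) A third solution uses the BINARY_CHECKSUM function to generate a checksum value for each row. By comparing these checksum values for each ID/row, we can detect changes: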

ALTER TABLE dbo.InternalObj DROP COLUMN [rw]
ALTER TABLE dbo.ExternalObj DROP COLUMN [rw]
GO

-- Test / sync
;WITH CteTarget
AS (
    SELECT *, BINARY_CHECKSUM(*) AS CRC
    FROM (
    SELECT  ID, ColA -- Only selected columns
    FROM    dbo.InternalObj 
    ) x
), CteSource
AS (
    SELECT *, BINARY_CHECKSUM(*) AS CRC
    FROM (
    SELECT  ID, ColA -- Only selected columns
    FROM    dbo.ExternalObj 
    ) y
)
MERGE   CteTarget i
USING   CteSource e ON i.ID = e.ID 
WHEN MATCHED AND EXISTS(SELECT i.CRC EXCEPT SELECT e.CRC) -- Same [ID] but different [CRC] values
    THEN
    UPDATE -- This clause updates changed rows
    SET i.ColA  = e.ColA
WHEN NOT MATCHED 
    THEN 
    INSERT  (ID, ColA) -- This clause inserts new rows
    VALUES  (e.ID, e.ColA);
GO
SELECT * FROM dbo.InternalObj;
/*
ID          ColA
----------- ----
1           A
2           B
*/
GO

Note: the BINARY_CHECKSUM function can produce hash collisions (rows with different values => same checksum => change not detected).
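If the collision risk is a concern, one alternative (a swapped-in technique, not what the answer above uses) is to compare a cryptographic digest of each row instead of a checksum. The Ruby sketch below assumes rows are available as hashes on the application side, for instance in a Rails import task; `row_digest` and the column names are hypothetical. On the database side, SQL Server's HASHBYTES function provides comparable algorithms.

```ruby
require "digest"

# Hypothetical helper: hashes the chosen columns of a row in a fixed order.
def row_digest(row, columns)
  # Each value is length-prefixed so that ["ab", "c"] and ["a", "bc"]
  # cannot produce the same payload; nil gets its own marker.
  payload = columns.map do |col|
    value = row[col]
    value.nil? ? "N" : "V#{value.to_s.bytesize}:#{value}"
  end.join("|")
  Digest::SHA256.hexdigest(payload)
end

columns  = [:id, :col_a]
external = { id: 2, col_a: "B#" }
internal = { id: 2, col_a: "B" }

changed   = row_digest(external, columns) != row_digest(internal, columns)
unchanged = row_digest(internal, columns) == row_digest(internal.dup, columns)
puts "row 2 changed: #{changed}"
```

A SHA-256 digest is larger and slower to compute than BINARY_CHECKSUM, so this trades some speed and storage for a negligible chance of a missed change.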