Best method for updating sandbox tables from production tables/views

Time: 2018-03-07 13:26:04

Tags: sql-server development-environment production-environment insert-update

Using SQL, it is taking over 4 hours every evening to pull over all the data from the twelve Production database tables or views needed for our Sandbox database. There has to be a significantly more efficient and effective manner to get this data into our Sandbox.

Currently, I'm creating a UID (unique ID) by concatenating each view's primary keys and system date fields.

The UID is used in two steps:

Step 1. INSERT INTO Sandbox WHERE UID IS NULL, looking back only the last 30 days based on the system date (using a LEFT JOIN from the Production table/view UID to the existing Sandbox table/view UID).

Step 2. UPDATE Sandbox WHERE Production.UID = Sandbox.UID (using an INNER JOIN from the Production table/view UID to the existing Sandbox table/view UID).
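For reference, the two steps above might look roughly like this in T-SQL. This is only a sketch: the database names `Prod` and `Sandbox`, the table `dbo.Orders`, the columns `OrderID`, `SystemDate`, `Amount`, and `UID`, and the exact UID format are all hypothetical, and it assumes both databases are on the same server and that `CONCAT` is available (SQL Server 2012 or later).

```sql
-- Step 1: insert production rows from the last 30 days that the sandbox lacks.
-- The UID is built from the primary key plus the system date (yyyymmdd).
WITH src AS (
    SELECT CONCAT(p.OrderID, '|', CONVERT(char(8), p.SystemDate, 112)) AS UID,
           p.OrderID,
           p.SystemDate,
           p.Amount
    FROM Prod.dbo.Orders AS p
    WHERE p.SystemDate >= DATEADD(DAY, -30, CAST(GETDATE() AS date))
)
INSERT INTO Sandbox.dbo.Orders (UID, OrderID, SystemDate, Amount)
SELECT src.UID, src.OrderID, src.SystemDate, src.Amount
FROM src
LEFT JOIN Sandbox.dbo.Orders AS s
    ON s.UID = src.UID
WHERE s.UID IS NULL;          -- no match in the sandbox yet

-- Step 2: update sandbox rows that already have a matching UID.
UPDATE s
SET s.Amount = p.Amount
FROM Sandbox.dbo.Orders AS s
INNER JOIN Prod.dbo.Orders AS p
    ON s.UID = CONCAT(p.OrderID, '|', CONVERT(char(8), p.SystemDate, 112));
```

Written this way, Step 2 has no date filter, so it compares every production row against the sandbox on each run.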

I've cut the 4 hour time down to 2 hours, but it feels like this process I've created is missing a (big) step.

How can I cut this time down? Should I put a 30 day filter on my UPDATE statement as well?

2 answers:

Answer 0 (score: 0)

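Assuming you aren't moving billions of rows into your development environment, I would just create a simple ETL strategy that truncates the development environment and does a full load from production. If you don't want the full data set, add a filter to the ETL's source query. Just make sure the filter doesn't affect the integrity of the data.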
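A minimal sketch of that truncate-and-reload approach in T-SQL follows. The names `Prod.dbo.Orders` and `Sandbox.dbo.Orders` and their columns are hypothetical, and it assumes both databases are on the same server and that no foreign key references the sandbox table (`TRUNCATE TABLE` fails otherwise).

```sql
SET XACT_ABORT ON;   -- roll back the whole batch if any statement fails

BEGIN TRANSACTION;

-- Empty the sandbox copy without logging individual row deletes.
TRUNCATE TABLE Sandbox.dbo.Orders;

-- Reload from production in a single set-based statement.
INSERT INTO Sandbox.dbo.Orders (OrderID, SystemDate, Amount)
SELECT p.OrderID, p.SystemDate, p.Amount
FROM Prod.dbo.Orders AS p
-- Optional source filter; remove it to load the full data set.
WHERE p.SystemDate >= DATEADD(DAY, -90, CAST(GETDATE() AS date));

COMMIT TRANSACTION;
```

Because nothing is compared row by row, there is no UID to build and no join against the existing sandbox data.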

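If your data does run to billions of rows, you probably already have an enterprise storage solution. Many of these can snapshot the data files to another location. This approach has security implications that you would also need to consider.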

Answer 1 (score: 0)

I found a two-part answer. It may not be the best solution, but it seems to be working for now.

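For (most of) the tables in the production database, I can use the primary key as the UID, and I update them using a 30-90 day filter.

The views are a bit trickier, because they combine two tables and have duplicate primary keys. So I created my own UID by concatenating multiple primary key fields, and I update them using a 30-90 day filter.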
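The two parts could be sketched in T-SQL as follows. All object and column names (`Orders`, `vOrderLines`, `OrderID`, `LineID`, `SystemDate`, and so on) are hypothetical, and the 90-day cutoff is just one value from the 30-90 day range. For the view, this sketch joins on the key columns individually instead of on a concatenated string; that is a substitution on my part, not necessarily what was actually done.

```sql
DECLARE @cutoff date = DATEADD(DAY, -90, CAST(GETDATE() AS date));

-- Part 1: tables, matched on the primary key, limited to the date window.
UPDATE s
SET s.Amount = p.Amount
FROM Sandbox.dbo.Orders AS s
INNER JOIN Prod.dbo.Orders AS p
    ON p.OrderID = s.OrderID
WHERE p.SystemDate >= @cutoff;

-- Part 2: views, matched on several key columns that are unique together.
UPDATE s
SET s.Quantity = v.Quantity
FROM Sandbox.dbo.OrderLines AS s
INNER JOIN Prod.dbo.vOrderLines AS v
    ON  v.OrderID = s.OrderID
    AND v.LineID  = s.LineID
WHERE v.SystemDate >= @cutoff;
```

Rows older than the cutoff are never touched, so changes made in production to older records would not reach the sandbox.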

The previous process took upwards of 4 hours to complete. The new process finishes within an hour and seems to be working so far.