我们正在组合在不同阶段设置的某些单独的公司数据库,它们都具有大致相同的数据,但不具有相同的顺序/ ID。
这里是3个数据库的伪覆盖,每个数据库有2个示例表:
+------------+-----------+---------------+
| DATABASE 1 | | |
+------------+-----------+---------------+
| fruit_id | name | |
| 1 | orange | |
| 2 | apple | |
| 3 | banana | |
| | | |
| sales_id | fruit_ids | |
| 1924 | 2,3 | apple,banana |
| 1925 | 1,3 | orange,apple |
| | | |
| DATABASE 2 | | |
| fruit_id | name | |
| 1 | apple | |
| 2 | orange | |
| 3 | banana | |
| | | |
| sales_id | fruit_ids | |
| 1924 | 2,3 | orange,banana |
| 1925 | 1,3 | apple,banana |
| | | |
| DATABASE 3 | | |
| fruit_id | name | |
| 1 | banana | |
| 2 | apple | |
| 3 | orange | |
| | | |
| sales_id | fruit_ids | |
| 1950 | 2,3 | apple,orange |
| 1951 | 1,3 | banana,orange |
+------------+-----------+---------------+
您会看到某些数据库sales_id实际上是重复的,甚至fruit_ids也与每个表中的不同项目相关。
我们正在使用的真实数据库会在很多地方的数据库周围点缀fruit_ids,而且有些数据库包含1m +行,因此手动MySQL查询有点不合适。
我们需要组合的实际数据库具有比上述伪表更高的复杂性,但我正在寻找是否有任何逻辑/工具/软件可以帮助将MySQL数据库与这种类似的数据相结合?
答案 0 :(得分:1)
到目前为止,答案并不多。这是一些初步的努力,以显示在具有相同格式的一些数据库之间生成公共密钥:
// Source databases or companies
// -----------------------------
$corp = array();
$corp[0] = 'XYZ Corp';
$corp[1] = 'ABC Corp';
$corp[2] = '123 Corp';
// Source fruit lists
// ------------------
$fruit = array();
$fruit[0]['orange'] = 1;
$fruit[0]['apple'] = 2;
$fruit[0]['banana'] = 3;
$fruit[0]['kiwi'] = 4;
$fruit[1]['apple'] = 1;
$fruit[1]['orange'] = 2;
$fruit[1]['banana'] = 3;
$fruit[1]['pear'] = 4;
$fruit[2]['banana'] = 1;
$fruit[2]['apple'] = 2;
$fruit[2]['orange'] = 3;
$fruit[2]['grape'] = 4;
// Master fruit list
// -----------------
echo "Generating common fruit key list...<br>\n";
// Generate common keys in sorted item order
// -----------------------------------------
$common = array();
foreach ($corp as $corpid => $name)
{
echo "<br>\n";
echo "... examining fruit keys for " . $corp[$corpid] . "<br>\n";
foreach ($fruit[$corpid] as $name => $id)
{
$common[$name] = 0;
echo "... ... fruit $name has key $id<br>\n";
}
}
ksort($common);
$i = 0;
foreach ($common as $name => $dummy )
$common[$name] = ++$i;
echo "<br>\n";
print_r($common);
echo "<br><br>\n";
// Demonstrate index conversions
// -----------------------------
echo "Demonstrating conversions to common indexes per company...<br>\n";
foreach ($corp as $corpid => $name)
{
echo "<br>\n";
echo "... examining key conversions for " . $corp[$corpid] . "<br>\n";
foreach ($fruit[$corpid] as $name => $id)
{
$new = $common[$name];
$old = $id;
echo "... ... fruit $name key changes from $old to $new<br>\n";
}
}
您将获得类似于以下内容的结果:
Generating common fruit key list...
... examining fruit keys for XYZ Corp
... ... fruit orange has key 1
... ... fruit apple has key 2
... ... fruit banana has key 3
... ... fruit kiwi has key 4
... examining fruit keys for ABC Corp
... ... fruit apple has key 1
... ... fruit orange has key 2
... ... fruit banana has key 3
... ... fruit pear has key 4
... examining fruit keys for 123 Corp
... ... fruit banana has key 1
... ... fruit apple has key 2
... ... fruit orange has key 3
... ... fruit grape has key 4
Array ( [apple] => 1 [banana] => 2 [grape] => 3 [kiwi] => 4 [orange] => 5 [pear] => 6 )
Demonstrating conversions to common indexes per company...
... examining key conversions for XYZ Corp
... ... fruit orange key changes from 1 to 5
... ... fruit apple key changes from 2 to 1
... ... fruit banana key changes from 3 to 2
... ... fruit kiwi key changes from 4 to 4
... examining key conversions for ABC Corp
... ... fruit apple key changes from 1 to 1
... ... fruit orange key changes from 2 to 5
... ... fruit banana key changes from 3 to 2
... ... fruit pear key changes from 4 to 6
... examining key conversions for 123 Corp
... ... fruit banana key changes from 1 to 2
... ... fruit apple key changes from 2 to 1
... ... fruit orange key changes from 3 to 5
... ... fruit grape key changes from 4 to 3
需要考虑的一些问题:
这应该为思考这个提供一个起点。将所有源数据库处理到新数据库然后验证合并结果可能需要多长时间。