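I have a dump table that is populated by bulk inserts, and I want to split its rows into separate tables by category.
This is my dump table, which contains data extracted from a text file.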
==========================DUMP===============================
| Employee Name | Company | Family Tree Name | Relationship |
=============================================================
| Bryan Fury | Guugle | Jenny Fury | Wife |
| | | Peter Fury | Son |
| | | Mary Fury | Daughter |
| Paul Pheonix | Soony | Linda Phoenix | Wife |
| | | Peter Phoenix | Son |
| | | John Phoenix | Son |
| Gwen Zamora | Aple | Sebastian Zamora | Husband |
| | | Ryan Zamora | Son |
=============================================================
I want to split it into two tables that are linked by an identifier, like this:
================EMPLOYEE===============
| Employee Name | Company | Tagging |
=======================================
| Bryan Fury | Guugle | Family 1 |
| Paul Pheonix | Soony | Family 2 |
| Gwen Zamora | Aple | Family 3 |
=======================================
==============FAMILY TREE===================
| Name | Relationship| Tagging |
============================================
| Jenny Fury | Wife | Family 1 |
| Peter Fury | Son | Family 1 |
| Mary Fury | Daughter | Family 1 |
| Linda Phoenix | Wife | Family 2 |
| Peter Phoenix | Son | Family 2 |
| John Phoenix | Son | Family 2 |
| Sebastian Zamora| Husband | Family 3 |
| Ryan Zamora | Son | Family 3 |
============================================
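Answer 0: (score: 0)
Assuming the dump table has a column that can be used to fetch the records in insertion order, here is one way to handle it: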
try
{
    $conn = new PDO("mysql:host=".DATABASE_HOST.";dbname=".DATABASE_NAME.";charset=UTF8", DATABASE_USER, DATABASE_PASS);
    $conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
}
catch(PDOException $e)
{
    trigger_error("Can not connect to database: ".$e->getMessage(), E_USER_ERROR);
    die;
}

$stm_employee = $conn->prepare('INSERT INTO employee(employee_name, company) VALUES(:emp, :comp)');
$stm_tree = $conn->prepare('INSERT INTO family_tree(name, relationship, tagging) VALUES(:name, :relation, :tag)');
// Read the dump in insertion order (assumes an auto-increment `id` column).
$res = $conn->query('SELECT employee_name, company, family_tree, relationship FROM dump ORDER BY id');

$old_employee = '';
$old_tag = 0;
while($row = $res->fetch(PDO::FETCH_ASSOC))
{
    // A non-blank employee name that differs from the previous one starts a new family.
    // Continuation rows have a blank employee name and keep the current tag.
    if(trim((string) $row['employee_name']) !== '' && $row['employee_name'] != $old_employee)
    {
        $stm_employee->execute(array(
            'emp' => $row['employee_name'],
            'comp' => $row['company']
        ));
        // The new employee's auto-increment id becomes the family tag.
        $old_tag = $conn->lastInsertId();
        $old_employee = $row['employee_name'];
    }
    // Every row carries one family member, tagged with the current employee's id.
    $stm_tree->execute(array(
        'name' => $row['family_tree'],
        'relation' => $row['relationship'],
        'tag' => $old_tag
    ));
}
// Empty the dump table once everything has been copied.
$conn->query('TRUNCATE TABLE dump');
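The row-by-row walk above can be tried without a database. The following is only a minimal sketch in Python, not the answer's own code: the function name `split_dump` and the tuple layout are hypothetical, the tag is a plain counter standing in for `lastInsertId()`, and it assumes that continuation rows have a blank employee name, as in the dump table shown in the question.

```python
def split_dump(rows):
    """Split dump rows into (employees, family) lists.

    rows: iterable of (employee_name, company, member_name, relationship)
    tuples in insertion order. Continuation rows are assumed to have a
    blank (empty or None) employee_name.
    """
    employees = []       # (tag, employee_name, company)
    family = []          # (member_name, relationship, tag)
    current_name = None  # last non-blank employee name seen
    tag = 0              # counter standing in for the auto-increment id

    for employee_name, company, member_name, relationship in rows:
        name = (employee_name or "").strip()
        # A non-blank name that differs from the previous one starts a new family.
        if name and name != current_name:
            tag += 1
            current_name = name
            employees.append((tag, name, company))
        # Every row contributes one family member under the current tag.
        family.append((member_name, relationship, tag))

    return employees, family


# Sample data copied from the dump table in the question.
dump_rows = [
    ("Bryan Fury", "Guugle", "Jenny Fury", "Wife"),
    ("", "", "Peter Fury", "Son"),
    ("", "", "Mary Fury", "Daughter"),
    ("Paul Pheonix", "Soony", "Linda Phoenix", "Wife"),
    ("", "", "Peter Phoenix", "Son"),
    ("", "", "John Phoenix", "Son"),
    ("Gwen Zamora", "Aple", "Sebastian Zamora", "Husband"),
    ("", "", "Ryan Zamora", "Son"),
]

employees, family = split_dump(dump_rows)
```

Because the tag only changes when a new non-blank name appears, the approach depends entirely on reading the rows in their original order, which is why the query needs the `ORDER BY id`.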
Answer 1: (score: 0)
Classic "normalization".
Assuming these are the two tables needed:
CREATE TABLE Employee (
    family_id INT UNSIGNED AUTO_INCREMENT,
    name ...,
    company ...,
    PRIMARY KEY(family_id)
) ENGINE=InnoDB;
CREATE TABLE FamilyTree (
    id INT UNSIGNED AUTO_INCREMENT,
    family_id INT UNSIGNED,
    name ...,
    relationship ...,
    PRIMARY KEY(id)
) ENGINE=InnoDB;
Here is the SQL to populate them:
-- Create ids for each "family" (`family_id` will be automatically set):
INSERT INTO Employee (name, company)
    SELECT DISTINCT employee_name, company
        FROM Dump;
-- Build the other table:
INSERT INTO FamilyTree (name, relationship, family_id)
    SELECT d.family_tree_name, d.relationship,
           e.family_id
        FROM Employee AS e
        JOIN Dump AS d ON d.employee_name = e.name
                      AND d.company = e.company;
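This takes less typing, and it is a good lesson in letting SQL do the work rather than laboriously coding SQL-like actions in a programming language.
If there is nepotism, you will have a problem (for instance, two employees who share a name and company would be merged into one row by the `SELECT DISTINCT`).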
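The two `INSERT ... SELECT` statements can be tried without a MySQL server. The sketch below runs them through Python's built-in `sqlite3` module, so SQLite stands in for MySQL here. Several things in it are assumptions rather than part of the answer: the `TEXT` column types (the answer leaves them elided), the `Dump` table layout, and above all that every dump row already carries its employee name and company. The join matches on those two columns, so rows where they are blank would not be attached to any family.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schemas: TEXT types and the Dump layout are illustrative only.
conn.executescript("""
CREATE TABLE Dump (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    employee_name TEXT,
    company TEXT,
    family_tree_name TEXT,
    relationship TEXT
);
CREATE TABLE Employee (
    family_id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    company TEXT
);
CREATE TABLE FamilyTree (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    family_id INTEGER,
    name TEXT,
    relationship TEXT
);
""")

# Sample data from the question, with the employee columns filled in on every row.
rows = [
    ("Bryan Fury", "Guugle", "Jenny Fury", "Wife"),
    ("Bryan Fury", "Guugle", "Peter Fury", "Son"),
    ("Bryan Fury", "Guugle", "Mary Fury", "Daughter"),
    ("Paul Pheonix", "Soony", "Linda Phoenix", "Wife"),
    ("Paul Pheonix", "Soony", "Peter Phoenix", "Son"),
    ("Paul Pheonix", "Soony", "John Phoenix", "Son"),
    ("Gwen Zamora", "Aple", "Sebastian Zamora", "Husband"),
    ("Gwen Zamora", "Aple", "Ryan Zamora", "Son"),
]
conn.executemany(
    "INSERT INTO Dump (employee_name, company, family_tree_name, relationship) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

# Step 1: one Employee row (and one generated family_id) per distinct employee.
conn.execute("""
INSERT INTO Employee (name, company)
SELECT DISTINCT employee_name, company
FROM Dump
""")

# Step 2: attach each family member to the family_id of the matching employee.
conn.execute("""
INSERT INTO FamilyTree (name, relationship, family_id)
SELECT d.family_tree_name, d.relationship, e.family_id
FROM Employee AS e
JOIN Dump AS d ON d.employee_name = e.name
              AND d.company = e.company
""")
conn.commit()

# Count the family members attached to each employee.
members_per_employee = dict(conn.execute("""
SELECT e.name, COUNT(*)
FROM Employee AS e
JOIN FamilyTree AS f ON f.family_id = e.family_id
GROUP BY e.name
"""))
```

The order in which `SELECT DISTINCT` returns rows is not guaranteed, so the specific `family_id` each employee receives should not be relied on; only the link between the two tables matters.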