如何透视MySQL实体 - 属性 - 值模式

时间:2009-03-16 09:58:21

标签: mysql database-design pivot entity-attribute-value

我需要设计存储文件所有元数据的表(即文件名,作者,标题,创建日期)和自定义元数据(用户已添加到文件中,例如CustUseBy,CustSendBy)。无法预先设置自定义元数据字段的数量。实际上,确定在文件中添加了什么和多少自定义标记的唯一方法是检查表中存在的内容。

为了存储它,我创建了一个基表(包含文件的所有公共元数据),一个Attributes表(包含可以在文件上设置的附加,可选属性)和一个FileAttributes表(为文件的属性赋值。)

CREAT TABLE FileBase (
    id VARCHAR(32) PRIMARY KEY,
    name VARCHAR(255) UNIQUE NOT NULL,
    title VARCHAR(255),
    author VARCHAR(255),
    created DATETIME NOT NULL,
) Engine=InnoDB;

CREATE TABLE Attributes (
    id VARCHAR(32) PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    type VARCHAR(255) NOT NULL
) Engine=InnoDB;

CREATE TABLE FileAttributes (
    sNo INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    fileId VARCHAR(32) NOT NULL,
    attributeId VARCHAR(32) NOT NULL,
    attributeValue VARCHAR(255) NOT NULL,
    FOREIGN KEY fileId REFERENCES FileBase (id),
    FOREIGN KEY attributeId REFERENCES Attributes (id)
 ) Engine=InnoDB;

示例数据:

INSERT INTO FileBase
(id,      title,  author,  name,        created)
  VALUES
('F001', 'Dox',   'vinay', 'story.dox', '2009/01/02 15:04:05'),
('F002', 'Excel', 'Ajay',  'data.xls',  '2009/02/03 01:02:03');

INSERT INTO Attributes
(id,      name,            type)
  VALUES
('A001', 'CustomeAttt1',  'Varchar(40)'),
('A002', 'CustomUseDate', 'Datetime');

INSERT INTO FileAttributes 
(fileId, attributeId, attributeValue)
  VALUES
('F001', 'A001',      'Akash'),
('F001', 'A002',      '2009/03/02');

现在问题是我想以这样的方式显示数据:

FileId, Title, Author, CustomAttri1, CustomAttr2, ...
F001    Dox    vinay   Akash         2009/03/02   ...
F002    Excel  Ajay     

什么查询会生成此结果?

7 个答案:

答案 0 :(得分:21)

这个问题提到了MySQL,实际上这个DBMS对这类问题有一个特殊的功能:GROUP_CONCAT(expr)。看看MySQL reference manual on group-by-functions。该功能已在MySQL 4.1版中添加。您将在查询中使用GROUP BY FileID

我不确定你希望结果如何。如果你想为每个项目列出每个属性(即使没有设置),它将更难。但是,这是我对如何做的建议:

SELECT bt.FileID, Title, Author, 
 GROUP_CONCAT(
  CONCAT_WS(':', at.AttributeName, at.AttributeType, avt.AttributeValue) 
  ORDER BY at.AttributeName SEPARATOR ', ') 
FROM BaseTable bt JOIN AttributeValueTable avt ON avt.FileID=bt.FileID 
 JOIN AttributeTable at ON avt.AttributeId=at.AttributeId 
GROUP BY bt.FileID;

这为您提供了相同顺序的所有属性,这可能很有用。输出将如下所示:

'F001', 'Dox', 'vinay', 'CustomAttr1:varchar(40):Akash, CustomUseDate:Datetime:2009/03/02'

这样,您只需要一个数据库查询,并且输出很容易解析。如果要将属性存储为数据库中的实际日期时间等,则需要使用动态SQL,但我会从中保持清晰并将值存储在varchars中。

答案 1 :(得分:9)

此类查询的一般形式为

SELECT file.*,
   attr1.value AS 'Attribute 1 Name', 
   attr2.value AS 'Attribute 2 Name', 
   ...
FROM
   file 
   LEFT JOIN attr AS attr1 
      ON(file.FileId=attr1.FileId and attr1.AttributeId=1)
   LEFT JOIN attr AS attr2 
      ON(file.FileId=attr2.FileId and attr2.AttributeId=2)
   ...

因此,您需要根据所需的属性动态构建查询。在php-ish伪代码中

$cols="file";
$joins="";

$rows=$db->GetAll("select * from Attributes");
foreach($rows as $idx=>$row)
{
   $alias="attr{$idx}";
   $cols.=", {$alias}.value as '".mysql_escape_string($row['AttributeName'])."'";   
   $joins.="LEFT JOIN attr as {$alias} on ".
       "(file.FileId={$alias}.FileId and ".
       "{$alias}.AttributeId={$row['AttributeId']}) ";
}

 $pivotsql="select $cols from file $joins";

答案 2 :(得分:8)

如果您正在寻找比组结果更可用(和可加入)的内容,请尝试以下解决方案。我已经创建了一些与你的例子非常相似的表格,以使其有意义。

这适用于:

  • 您需要纯SQL解决方案(无代码,无循环)
  • 您有一组可预测的属性(例如,不是动态的)
  • 您可以在需要添加新属性类型时更新查询
  • 您希望结果可以加入,联合或嵌套为子选择

表A(文件)

FileID, Title, Author, CreatedOn

表B(属性)

AttrID, AttrName, AttrType [not sure how you use type...]

表C(Files_Attributes)

FileID, AttrID, AttrValue

传统查询会拉出许多冗余行:

SELECT * FROM 
Files F 
LEFT JOIN Files_Attributes FA USING (FileID)
LEFT JOIN Attributes A USING (AttributeID);
AttrID  FileID  Title           Author  CreatedOn   AttrValue   AttrName    AttrType
50      1       TestFile        Joe     2011-01-01  true        ReadOnly        bool
60      1       TestFile        Joe     2011-01-01  xls         FileFormat      text
70      1       TestFile        Joe     2011-01-01  false       Private         bool
80      1       TestFile        Joe     2011-01-01  2011-10-03  LastModified    date
60      2       LongNovel       Mary    2011-02-01  json        FileFormat      text
80      2       LongNovel       Mary    2011-02-01  2011-10-04  LastModified    date
70      2       LongNovel       Mary    2011-02-01  true        Private         bool
50      2       LongNovel       Mary    2011-02-01  true        ReadOnly        bool
50      3       ShortStory      Susan   2011-03-01  false       ReadOnly        bool
60      3       ShortStory      Susan   2011-03-01  ascii       FileFormat      text
70      3       ShortStory      Susan   2011-03-01  false       Private         bool
80      3       ShortStory      Susan   2011-03-01  2011-10-01  LastModified    date
50      4       ProfitLoss      Bill    2011-04-01  false       ReadOnly        bool
70      4       ProfitLoss      Bill    2011-04-01  true        Private         bool
80      4       ProfitLoss      Bill    2011-04-01  2011-10-02  LastModified    date
60      4       ProfitLoss      Bill    2011-04-01  text        FileFormat      text
50      5       MonthlyBudget   George  2011-05-01  false       ReadOnly        bool
60      5       MonthlyBudget   George  2011-05-01  binary      FileFormat      text
70      5       MonthlyBudget   George  2011-05-01  false       Private         bool
80      5       MonthlyBudget   George  2011-05-01  2011-10-20  LastModified    date

这个合并查询(使用MAX的方法)可以合并行:

SELECT
F.*,
MAX( IF(A.AttrName = 'ReadOnly', FA.AttrValue, NULL) ) as 'ReadOnly',
MAX( IF(A.AttrName = 'FileFormat', FA.AttrValue, NULL) ) as 'FileFormat',
MAX( IF(A.AttrName = 'Private', FA.AttrValue, NULL) ) as 'Private',
MAX( IF(A.AttrName = 'LastModified', FA.AttrValue, NULL) ) as 'LastModified'
FROM 
Files F 
LEFT JOIN Files_Attributes FA USING (FileID)
LEFT JOIN Attributes A USING (AttributeID)
GROUP BY
F.FileID;
FileID  Title           Author  CreatedOn   ReadOnly    FileFormat  Private LastModified
1       TestFile        Joe     2011-01-01  true        xls         false   2011-10-03
2       LongNovel       Mary    2011-02-01  true        json        true    2011-10-04
3       ShortStory      Susan   2011-03-01  false       ascii       false   2011-10-01
4       ProfitLoss      Bill    2011-04-01  false       text        true    2011-10-02
5       MonthlyBudget   George  2011-05-01  false       binary      false   2011-10-20

答案 3 :(得分:6)

这是SQL中标准的“行到列”问题。

最容易在SQL之外完成。

在您的应用程序中,执行以下操作:

  1. 定义一个简单的类来包含文件,系统属性和用户属性集合。列表是此客户属性集合的不错选择。我们称这个类为FileDescription。

  2. 在文件和文件的所有客户属性之间执行简单连接。

  3. 编写循环以从查询结果中组装FileDescriptions。

    • 获取第一行,创建FileDescription并设置第一个客户属性。

    • 虽然还有更多行要提取:

      • 获取一行
      • 如果此行的文件名与我们正在构建的FileDescription不匹配:完成构建FileDescription;将此附加到文件描述集合的结果;使用给定名称和第一个客户属性创建一个新的空FileDescription。
      • 如果此行的文件名与我们正在构建的FileDescription匹配:将另一个customer属性附加到当前的FileDescription

答案 4 :(得分:3)

我一直在尝试不同的答案,Methai的答案对我来说最方便。我目前的项目虽然确实使用了Doctrine和MySQL,但它有很多松散的表格。

以下是我对Methai解决方案的经验结果:

创建实体表

DROP TABLE IF EXISTS entity;
CREATE TABLE entity (
    id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    title VARCHAR(255),
    author VARCHAR(255),
    createdOn DATETIME NOT NULL
) Engine = InnoDB;

创建属性表

DROP TABLE IF EXISTS attribute;
CREATE TABLE attribute (
    id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(255) NOT NULL,
    type VARCHAR(255) NOT NULL
) Engine = InnoDB;

创建属性值表

DROP TABLE IF EXISTS attributevalue;
CREATE TABLE attributevalue (
    id INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
    value VARCHAR(255) NOT NULL,
    attribute_id INT UNSIGNED NOT NULL,
    FOREIGN KEY(attribute_id) REFERENCES attribute(id)
 ) Engine = InnoDB;

创建entity_attributevalue联接表

DROP TABLE IF EXISTS entity_attributevalue;
CREATE TABLE entity_attributevalue (
    entity_id INT UNSIGNED NOT NULL,
    attributevalue_id INT UNSIGNED NOT NULL,
    FOREIGN KEY(entity_id) REFERENCES entity(id),
    FOREIGN KEY(attributevalue_id) REFERENCES attributevalue(id)
) Engine = InnoDB;

填充实体表

INSERT INTO entity
    (title, author, createdOn)
VALUES
    ('TestFile', 'Joe', '2011-01-01'),
    ('LongNovel', 'Mary', '2011-02-01'),
    ('ShortStory', 'Susan', '2011-03-01'),
    ('ProfitLoss', 'Bill', '2011-04-01'),
    ('MonthlyBudget', 'George', '2011-05-01'),
    ('Paper', 'Jane', '2012-04-01'),
    ('Essay', 'John', '2012-03-01'),
    ('Article', 'Dan', '2012-12-01');

填充属性表

INSERT INTO attribute
    (name, type)
VALUES
    ('ReadOnly', 'bool'),
    ('FileFormat', 'text'),
    ('Private', 'bool'),
    ('LastModified', 'date');

填充属性值表

INSERT INTO attributevalue 
    (value, attribute_id)
VALUES
    ('true', '1'),
    ('xls', '2'),
    ('false', '3'),
    ('2011-10-03', '4'),
    ('true', '1'),
    ('json', '2'),
    ('true', '3'),
    ('2011-10-04', '4'),
    ('false', '1'),
    ('ascii', '2'),
    ('false', '3'),
    ('2011-10-01', '4'),
    ('false', '1'),
    ('text', '2'),
    ('true', '3'),
    ('2011-10-02', '4'),
    ('false', '1'),
    ('binary', '2'),
    ('false', '3'),
    ('2011-10-20', '4'),
    ('doc', '2'),
    ('false', '3'),
    ('2011-10-20', '4'),
    ('rtf', '2'),
    ('2011-10-20', '4');

填充entity_attributevalue表

INSERT INTO entity_attributevalue 
    (entity_id, attributevalue_id)
VALUES
    ('1', '1'),
    ('1', '2'),
    ('1', '3'),
    ('1', '4'),
    ('2', '5'),
    ('2', '6'),
    ('2', '7'),
    ('2', '8'),
    ('3', '9'),
    ('3', '10'),
    ('3', '11'),
    ('3', '12'),
    ('4', '13'),
    ('4', '14'),
    ('4', '15'),
    ('4', '16'),
    ('5', '17'),
    ('5', '18'),
    ('5', '19'),
    ('5', '20'),
    ('6', '21'),
    ('6', '22'),
    ('6', '23'),
    ('7', '24'),
    ('7', '25');

显示所有记录

SELECT * 
FROM `entity` e
LEFT JOIN `entity_attributevalue` ea ON ea.entity_id = e.id
LEFT JOIN `attributevalue` av ON ea.attributevalue_id = av.id
LEFT JOIN `attribute` a ON av.attribute_id = a.id;
id  title           author  createdOn           entity_id   attributevalue_id   id      value       attribute_id    id      name            type
1   TestFile        Joe     2011-01-01 00:00:00 1           1                   1       true        1               1       ReadOnly        bool
1   TestFile        Joe     2011-01-01 00:00:00 1           2                   2       xls         2               2       FileFormat      text
1   TestFile        Joe     2011-01-01 00:00:00 1           3                   3       false       3               3       Private         bool
1   TestFile        Joe     2011-01-01 00:00:00 1           4                   4       2011-10-03  4               4       LastModified    date
2   LongNovel       Mary    2011-02-01 00:00:00 2           5                   5       true        1               1       ReadOnly        bool
2   LongNovel       Mary    2011-02-01 00:00:00 2           6                   6       json        2               2       FileFormat      text
2   LongNovel       Mary    2011-02-01 00:00:00 2           7                   7       true        3               3       Private         bool
2   LongNovel       Mary    2011-02-01 00:00:00 2           8                   8       2011-10-04  4               4       LastModified    date
3   ShortStory      Susan   2011-03-01 00:00:00 3           9                   9       false       1               1       ReadOnly        bool
3   ShortStory      Susan   2011-03-01 00:00:00 3           10                  10      ascii       2               2       FileFormat      text
3   ShortStory      Susan   2011-03-01 00:00:00 3           11                  11      false       3               3       Private         bool
3   ShortStory      Susan   2011-03-01 00:00:00 3           12                  12      2011-10-01  4               4       LastModified    date
4   ProfitLoss      Bill    2011-04-01 00:00:00 4           13                  13      false       1               1       ReadOnly        bool
4   ProfitLoss      Bill    2011-04-01 00:00:00 4           14                  14      text        2               2       FileFormat      text
4   ProfitLoss      Bill    2011-04-01 00:00:00 4           15                  15      true        3               3       Private         bool
4   ProfitLoss      Bill    2011-04-01 00:00:00 4           16                  16      2011-10-02  4               4       LastModified    date
5   MonthlyBudget   George  2011-05-01 00:00:00 5           17                  17      false       1               1       ReadOnly        bool
5   MonthlyBudget   George  2011-05-01 00:00:00 5           18                  18      binary      2               2       FileFormat      text
5   MonthlyBudget   George  2011-05-01 00:00:00 5           19                  19      false       3               3       Private         bool
5   MonthlyBudget   George  2011-05-01 00:00:00 5           20                  20      2011-10-20  4               4       LastModified    date
6   Paper           Jane    2012-04-01 00:00:00 6           21                  21      binary      2               2       FileFormat      text
6   Paper           Jane    2012-04-01 00:00:00 6           22                  22      false       3               3       Private         bool
6   Paper           Jane    2012-04-01 00:00:00 6           23                  23      2011-10-20  4               4       LastModified    date
7   Essay           John    2012-03-01 00:00:00 7           24                  24      binary      2               2       FileFormat      text
7   Essay           John    2012-03-01 00:00:00 7           25                  25      2011-10-20  4               4       LastModified    date
8   Article         Dan     2012-12-01 00:00:00 NULL        NULL                NULL    NULL        NULL            NULL    NULL            NULL

数据透视表

SELECT e.*,
    MAX( IF(a.name = 'ReadOnly', av.value, NULL) ) as 'ReadOnly',
    MAX( IF(a.name = 'FileFormat', av.value, NULL) ) as 'FileFormat',
    MAX( IF(a.name = 'Private', av.value, NULL) ) as 'Private',
    MAX( IF(a.name = 'LastModified', av.value, NULL) ) as 'LastModified'
FROM `entity` e
LEFT JOIN `entity_attributevalue` ea ON ea.entity_id = e.id
LEFT JOIN `attributevalue` av ON ea.attributevalue_id = av.id
LEFT JOIN `attribute` a ON av.attribute_id = a.id
GROUP BY e.id;
id  title           author  createdOn           ReadOnly    FileFormat  Private LastModified
1   TestFile        Joe     2011-01-01 00:00:00 true        xls         false   2011-10-03
2   LongNovel       Mary    2011-02-01 00:00:00 true        json        true    2011-10-04
3   ShortStory      Susan   2011-03-01 00:00:00 false       ascii       false   2011-10-01
4   ProfitLoss      Bill    2011-04-01 00:00:00 false       text        true    2011-10-02
5   MonthlyBudget   George  2011-05-01 00:00:00 false       binary      false   2011-10-20
6   Paper           Jane    2012-04-01 00:00:00 NULL        binary      false   2011-10-20
7   Essay           John    2012-03-01 00:00:00 NULL        binary      NULL    2011-10-20
8   Article         Dan     2012-12-01 00:00:00 NULL        NULL        NULL    NULL

答案 5 :(得分:0)

然而,有一些解决方案可以将行用作列,也就是转置数据。 它涉及在纯SQL中执行它的查询技巧,或者您将不得不依赖某些特定数据库中的某些功能,使用数据透视表(或交叉表)。

As exemple you can see how to do this here in Oracle (11g).

编程版本的维护和制作将更加简单,而且可以与任何数据库一起使用。

答案 6 :(得分:-2)

部分答案,因为我不知道MySQL(好)。在MSSQL中,我会查看数据透视表或在存储过程中创建临时表。这可能很难......