PostgreSQL - How to JOIN M:M tables the right way?

Time: 2014-05-12 22:31:37

Tags: sql postgresql join database-design left-join

My database structure looks like this:

CREATE TABLE categories (
    name VARCHAR(30) PRIMARY KEY
);

CREATE TABLE additives (
    name VARCHAR(30) PRIMARY KEY
);

CREATE TABLE beverages (
    name VARCHAR(30) PRIMARY KEY,
    description VARCHAR(200),
    price NUMERIC(5, 2) NOT NULL CHECK (price >= 0),
    category VARCHAR(30) NOT NULL REFERENCES categories(name) ON DELETE CASCADE ON UPDATE CASCADE
);

CREATE TABLE b_additives_xref (
    bname VARCHAR(30) REFERENCES beverages(name) ON DELETE CASCADE ON UPDATE CASCADE,
    aname VARCHAR(30) REFERENCES additives(name) ON DELETE CASCADE ON UPDATE CASCADE, 
    PRIMARY KEY(bname, aname)
);


INSERT INTO categories VALUES
    ('Cocktails'), ('Biere'), ('Alkoholfreies');

INSERT INTO additives VALUES 
    ('Kaliumphosphat (E 340)'), ('Pektin (E 440)'), ('Citronensäure (E 330)');

INSERT INTO beverages VALUES
    ('Mojito Speciale', 'Cocktail mit Rum, Rohrzucker und Minze', 8, 'Cocktails'),
    ('Franziskaner Weißbier', 'Köstlich mildes Hefeweizen', 6, 'Biere'),
    ('Augustiner Hell', 'Frisch gekühlt vom Fass', 5, 'Biere'),
    ('Coca Cola', 'Coffeeinhaltiges Erfrischungsgetränk', 2.75, 'Alkoholfreies'),
    ('Sprite', 'Erfrischende Zitronenlimonade', 2.50, 'Alkoholfreies'),
    ('Karaffe Wasser', 'Kaltes, gashaltiges Wasser', 6.50, 'Alkoholfreies');

INSERT INTO b_additives_xref VALUES
    ('Coca Cola', 'Kaliumphosphat (E 340)'),
    ('Coca Cola', 'Pektin (E 440)'),
    ('Coca Cola', 'Citronensäure (E 330)');

SqlFiddle

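What I am trying to achieve is to list all beverages with their attributes (price, description, etc.) and add another column, additives, built from the b_additives_xref table, that holds a concatenated string of all additives contained in each beverage.

My query currently looks like this and is almost working (I guess):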

SELECT 
    beverages.name AS name, 
    beverages.description AS description, 
    beverages.price AS price,
    beverages.category AS category, 
    string_agg(additives.name, ', ') AS additives 
FROM beverages, additives
    LEFT JOIN b_additives_xref ON b_additives_xref.aname = additives.name 
GROUP BY beverages.name
ORDER BY beverages.category;

The output looks like this:

Coca Cola       | Coffeeinhaltiges Erfrischungsgetränk | 2.75 | Alkoholfreies | Kaliumphosphat (E 340), Pektin (E 440), Citronensäure (E 330)
Karaffe Wasser  | Kaltes, gashaltiges Wasser           | 6.50 | Alkoholfreies | Kaliumphosphat (E 340), Pektin (E 440), Citronensäure (E 330)
Sprite          | Erfrischende Zitronenlimonade        | 2.50 | Alkoholfreies | Kaliumphosphat (E 340), Pektin (E 440), Citronensäure (E 330)
Augustiner Hell | Frisch gekühlt vom Fass              | 5.00 | Biere         | Kaliumphosphat (E 340)[...]

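Of course, this is wrong, since only 'Coca Cola' has existing rows in the b_additives_xref table.
Except for the 'Coca Cola' row, all other rows should have a NULL or empty value in the additives column. What am I doing wrong?

3 answers:

Answer 0: (score: 2)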

Some advice on your schema and query.

Schema

CREATE TABLE category (
   category_id int PRIMARY KEY
  ,category    text UNIQUE NOT NULL
);

CREATE TABLE beverage (
   beverage_id serial PRIMARY KEY
  ,beverage    text UNIQUE NOT NULL  -- maybe not unique?
  ,description text
  ,price       int NOT NULL CHECK (price >= 0)  -- in Cent
  ,category_id int NOT NULL REFERENCES category ON UPDATE CASCADE
                                        -- not: ON DELETE CASCADE 
);

CREATE TABLE additive (
   additive_id serial PRIMARY KEY
  ,additive    text UNIQUE
);

CREATE TABLE bev_add (
    beverage_id int REFERENCES beverage ON DELETE CASCADE ON UPDATE CASCADE
   ,additive_id int REFERENCES additive ON DELETE CASCADE ON UPDATE CASCADE 
   ,PRIMARY KEY(beverage_id, additive_id)
);
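
  • Never use "name" as a name. It is a terrible, non-descriptive name.
  • Use small surrogate primary keys, preferably serial columns for big tables and plain integer columns for small tables. Chances are that the names of beverages and additives are not strictly unique and that you will want to change them from time to time, which makes them bad candidates for a primary key. integer columns are also smaller and faster to process.
  • If you only have a handful of categories with no additional attributes, consider using an enum instead.
  • When a foreign key and a primary key hold the same values, it is good practice to give them the same (descriptive) name.
  • I never use the plural form as a table name unless a single row holds multiple instances of something. The singular is shorter, just as meaningful, and leaves the plural for actual plural rows.
  • Just use text instead of character varying (n).
  • Think twice before you define a fk constraint to a look-up table with ON DELETE CASCADE. Typically you do not want all beverages to be deleted automatically when a category is deleted (by mistake).
  • Consider a plain integer column instead of NUMERIC(5, 2) (storing Cent instead of €/$). Smaller, faster, simpler. Format on output when needed.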
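
The enum and Cent suggestions in the list above could look roughly like the following. This is only a sketch under assumptions: the type name beverage_category, the table name beverage_enum_demo and the output alias price_eur are made up for illustration and do not appear in the answer's schema.

```sql
-- Hypothetical enum standing in for the category look-up table.
CREATE TYPE beverage_category AS ENUM ('Cocktails', 'Biere', 'Alkoholfreies');

-- Illustrative table only; not part of the schema suggested above.
CREATE TABLE beverage_enum_demo (
   beverage_id serial PRIMARY KEY
  ,beverage    text UNIQUE NOT NULL
  ,price       int NOT NULL CHECK (price >= 0)  -- in Cent
  ,category    beverage_category NOT NULL
);

INSERT INTO beverage_enum_demo (beverage, price, category) VALUES
   ('Coca Cola', 275, 'Alkoholfreies')
  ,('Augustiner Hell', 500, 'Biere');

-- Convert Cent to a decimal amount only when formatting the output.
SELECT beverage
      ,to_char(price / 100.0, 'FM990.00') AS price_eur
      ,category
FROM   beverage_enum_demo
ORDER  BY category;  -- enum values sort in the order they were declared
```

The trade-off is that adding or renaming a category then means altering the type rather than inserting a row, which is why the answer limits the suggestion to a handful of categories without extra attributes.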

More advice and links in this closely related answer:
How to implement a many-to-many relationship in PostgreSQL?

Query

Adapted to the new schema, with some general advice.

SELECT b.*, string_agg(a.additive, ', ' ORDER BY a.additive) AS additives
                                     -- order by optional for sorted list
FROM   beverage      b
JOIN   category      c USING (category_id)
LEFT   JOIN bev_add ba USING (beverage_id)  -- simpler now
LEFT   JOIN additive a USING (additive_id)
GROUP  BY b.beverage_id, c.category_id
ORDER  BY c.category;
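
  • You don't need a column alias if the column name is the same as the alias.
  • With the suggested naming convention you can conveniently use USING in joins.
  • You also need to join to category and GROUP BY category_id or category (a drawback of the suggested schema).
  • For big tables the query will still be faster, because the tables and indexes are smaller and fewer pages need to be read.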
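
To try the query against the suggested schema, a few sample rows are needed. The following is a minimal sketch, assuming the four tables above were created as shown; the category_id values are picked by hand here, and prices are converted to Cent.

```sql
-- category_id is a plain int in the suggested schema, so ids are supplied explicitly.
INSERT INTO category (category_id, category) VALUES
   (1, 'Cocktails'), (2, 'Biere'), (3, 'Alkoholfreies');

-- beverage_id is a serial column and is filled in automatically.
INSERT INTO beverage (beverage, description, price, category_id) VALUES
   ('Augustiner Hell', 'Frisch gekühlt vom Fass', 500, 2)
  ,('Coca Cola', 'Coffeeinhaltiges Erfrischungsgetränk', 275, 3)
  ,('Sprite', 'Erfrischende Zitronenlimonade', 250, 3);

INSERT INTO additive (additive) VALUES
   ('Kaliumphosphat (E 340)'), ('Pektin (E 440)'), ('Citronensäure (E 330)');

-- Look up the generated ids by name instead of hard-coding them.
-- The CROSS JOIN is intentional: Coca Cola gets every additive in this sample.
INSERT INTO bev_add (beverage_id, additive_id)
SELECT b.beverage_id, a.additive_id
FROM   beverage b
CROSS  JOIN additive a
WHERE  b.beverage = 'Coca Cola';
```

With these rows, the query should list the three additives for Coca Cola and NULL in the additives column for the other two beverages, since string_agg returns NULL when it has no non-null input.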

Answer 1: (score: 1)

I believe you are looking for this:

SELECT 
    B.name AS name, 
    B.description AS description, 
    B.price AS price,
    B.category AS category, 
    string_agg(A.name, ', ') AS additives 
FROM Beverages B
    LEFT JOIN b_additives_xref xref ON xref.bname = B.name 
    LEFT JOIN additives A ON A.name = xref.aname
GROUP BY B.name
ORDER BY B.category;

Output

NAME      | DESCRIPTION                          | PRICE | CATEGORY      | ADDITIVES
Coca Cola | Coffeeinhaltiges Erfrischungsgetränk | 2.75  | Alkoholfreies | Kaliumphosphat (E 340), Pektin (E 440), Citronensäure (E 330)

The problem was that you had a Cartesian product between the beverages and additives tables:

FROM beverages, additives

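Every record in one table got paired with every record in the other. Both tables need to be joined explicitly to the xref table.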
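
The effect can be checked directly against the question's tables and sample data. The sketch below is not part of the original answer; the aliases pairs, beverage and additive are chosen only for illustration.

```sql
-- Comma join with no join condition: every beverage pairs with every additive.
SELECT count(*) AS pairs
FROM   beverages, additives;   -- 6 beverages x 3 additives = 18 rows

-- An explicit JOIN binds more tightly than the comma, so the question's FROM
-- clause is read as: beverages CROSS JOIN (additives LEFT JOIN b_additives_xref ...).
-- The xref rows are never matched against beverages.name.
SELECT beverages.name        AS beverage
      ,additives.name        AS additive
      ,b_additives_xref.bname
FROM   beverages, additives
LEFT   JOIN b_additives_xref ON b_additives_xref.aname = additives.name
ORDER  BY beverage, additive;
```

In the second result every beverage appears next to all three additives, while bname is always 'Coca Cola'. That is why string_agg in the question's query reported the same additive list for every beverage.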

Answer 2: (score: 1)

The query I was looking for is the following:

SELECT 
    B.name AS name, 
    B.description AS description, 
    B.price AS price,
    B.category AS category, 
    string_agg(A.name, ', ') AS additives 
FROM beverages B
    LEFT JOIN b_additives_xref xref ON xref.bname = B.name 
    LEFT JOIN additives A on A.name = xref.aname
GROUP BY B.name
ORDER BY B.category;
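
The query above returns NULL in the additives column for beverages without additives. If an empty string is preferred, as the question also allows, one possible variation is to wrap the aggregate in COALESCE. This is a sketch only; the ORDER BY inside string_agg and the secondary sort on B.name are additions for a stable output and are not required.

```sql
SELECT B.name, B.description, B.price, B.category,
       -- string_agg yields NULL when a beverage has no additives; map that to ''.
       COALESCE(string_agg(A.name, ', ' ORDER BY A.name), '') AS additives
FROM   beverages B
LEFT   JOIN b_additives_xref xref ON xref.bname = B.name
LEFT   JOIN additives A ON A.name = xref.aname
GROUP  BY B.name   -- primary key, so the other B columns need not be listed
ORDER  BY B.category, B.name;
```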

Credit goes to Brad, who gave me the solution in his answer and comment.