How to protect a table against duplicate data

Time: 2017-02-25 14:24:51

Tags: sql postgresql database-design unique-constraint relational-division

I cannot work out how to secure my table against duplicate combinations of attributes_positions. The best way to show what I mean is the following image.

[Image: sample rows showing duplicate and non-duplicate combinations]

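id_combination is the number of a combination. A combination consists of attributes_positions, so a combination is a sequence of attributes_positions.

Now I want to protect the table against inserting exactly the same sequence of attributes_positions.

Of course, it is fine if an already inserted combination contains one more attributes_positions, or one fewer, than the combination being inserted.

In the image I show different combinations that are duplicates and combinations that are not.

Is there a way I can do this? Maybe something like 'before update'? But how would I implement it for this example? I am not very good with advanced SQL. The database in which I am trying to protect the table is PostgreSQL 9.4.

I would be very grateful for your help.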

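3 Answers:

Answer 0: (score: 0)

My answer assumes that the target has no duplicates and that we want to insert a new set, which happens to be a duplicate. I chose the group of 4, with an id_comb of 1.

You have to put the group of 4 into a staging table. Then you have to pivot both staging and target horizontally, so that you get 5 columns named attr_pos1 to attr_pos5 (the biggest group in the example has 5 members). To pivot, you need a sequence number, which we can get with ROW_NUMBER(). That goes for both tables, staging and target. Then you pivot both. Then you try to join the pivoted staging and target on all 5 attr_pos# columns and count the rows. If you get 0, you have no duplicate. If you get 1, you have a duplicate.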

Here is the whole scenario:

[code block not preserved]

Hope this helps. Marco
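As an illustration of the pivot-and-compare idea described in this answer (not the answer's own SQL), here is a minimal Python sketch. It assumes a combination is identified by its members regardless of insertion order, and the sample rows, the function names `pivot` and `count_matches`, and the staging id 99 are all hypothetical.

```python
from collections import defaultdict

def pivot(rows, width=5):
    """Turn (id_comb, attr_pos) rows into one fixed-width tuple per id_comb."""
    groups = defaultdict(list)
    for id_comb, attr_pos in rows:
        groups[id_comb].append(attr_pos)
    pivoted = {}
    for id_comb, positions in groups.items():
        # Sorting gives each value its slot, the role ROW_NUMBER() plays in SQL.
        ordered = sorted(positions)
        # Pad short groups with None, like the NULLs in unused attr_pos columns.
        padding = [None] * (width - len(ordered))
        pivoted[id_comb] = tuple(ordered + padding)
    return pivoted

def count_matches(target_rows, staging_rows, width=5):
    """Count (staging, target) pairs whose pivoted columns are all equal."""
    target = pivot(target_rows, width)
    staging = pivot(staging_rows, width)
    return sum(1 for s in staging.values() for t in target.values() if s == t)

# Hypothetical sample data, not taken from the question's image.
target = [(1, 1), (1, 2), (1, 3), (1, 4),
          (2, 1), (2, 2), (2, 3),
          (3, 1), (3, 2), (3, 3), (3, 4), (3, 5)]
duplicate_group = [(99, 4), (99, 3), (99, 2), (99, 1)]  # same members as id_comb 1
new_group = [(99, 1), (99, 2), (99, 4)]

print(count_matches(target, duplicate_group))  # 1
print(count_matches(target, new_group))        # 0
```

One thing to watch when doing the same join in SQL: the padded columns are NULL, and NULL = NULL is not true, so the join needs a null-safe comparison such as PostgreSQL's IS NOT DISTINCT FROM.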

Answer 1: (score: 0)

        -- The data
CREATE TABLE theset (
        set_id INTEGER NOT NULL PRIMARY KEY
        , set_name text UNIQUE
        );
INSERT INTO theset(set_id, set_name) VALUES
( 1, 'one'), ( 2, 'two'), ( 3, 'three'), ( 4, 'four');

CREATE TABLE theitem (
        item_id integer NOT NULL PRIMARY KEY
        , item_name text UNIQUE
        );
INSERT INTO theitem(item_id, item_name) VALUES
( 1, 'one'), ( 2, 'two'), ( 3, 'three'), ( 4, 'four'), ( 5, 'five');

CREATE TABLE set_item (
        set_id integer NOT NULL REFERENCES theset (set_id)
        , item_id integer NOT NULL REFERENCES theitem(item_id)
        , PRIMARY KEY (set_id,item_id)
        );
        -- swapped index is indicated for junction tables
CREATE UNIQUE INDEX ON set_item(item_id, set_id);

INSERT INTO set_item(set_id,item_id) VALUES
(1,1), (1,2), (1,3), (1,4),
(2,1), (2,2), (2,3), -- (2,4),
(3,1), (3,2), (3,3), (3,4), (3,5),
(4,1), (4,2), (4,4);

CREATE FUNCTION set_item_unique_set( ) RETURNS TRIGGER AS
$func$
DECLARE
        cur_set_id integer;     -- the set touched by this row
BEGIN
-- NEW is not available for DELETE, OLD is not available for INSERT
IF TG_OP = 'DELETE' THEN
        cur_set_id := OLD.set_id;
ELSE
        cur_set_id := NEW.set_id;
END IF;

IF EXISTS ( -- another set with exactly the same items
        SELECT * FROM theset oth
        WHERE oth.set_id <> cur_set_id
        -- count the occurrences of each item over the two sets;
        -- an item that is not in both sets has count = 1
        AND NOT EXISTS (
                SELECT item_id FROM set_item x1
                WHERE (x1.set_id = cur_set_id OR x1.set_id = oth.set_id )
                GROUP BY item_id
                HAVING COUNT(*) = 1
                )
        ) THEN
        RAISE EXCEPTION 'Not unique set';
END IF;

-- the return value of an AFTER row trigger is ignored
RETURN NULL;
END;
$func$ LANGUAGE plpgsql
        ;

CREATE CONSTRAINT TRIGGER check_item_set_unique
        AFTER UPDATE OR INSERT OR DELETE
        -- BEFORE UPDATE OR INSERT
        ON set_item
        FOR EACH ROW
        EXECUTE PROCEDURE set_item_unique_set()
        ;

-- Test it
INSERT INTO set_item(set_id,item_id) VALUES(4,5); -- success
INSERT INTO set_item(set_id,item_id) VALUES(2,4); -- failure
DELETE FROM set_item WHERE set_id=1 AND item_id= 4; -- failure

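Note: there should also be a trigger for the DELETE case.

Update: added handling for DELETE.

(The handling of deletes is not perfect; imagine the case where the last element of a set is deleted.)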
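To try the trigger's core test outside the database, the sketch below runs the same NOT EXISTS / HAVING COUNT(*) = 1 check as a plain query. It is only an illustration under some assumptions: it uses Python's built-in sqlite3 instead of PostgreSQL, it scans set_item rather than theset (so sets without items are ignored), and the names `FIND_TWINS`, `load_sample` and `twins` are hypothetical.

```python
import sqlite3

# For a given set, list the other sets that contain exactly the same items.
# An item that belongs to only one of the two sets shows up with COUNT(*) = 1.
FIND_TWINS = """
SELECT DISTINCT oth.set_id
FROM set_item oth
WHERE oth.set_id <> :sid
  AND NOT EXISTS (
        SELECT x.item_id
        FROM set_item x
        WHERE x.set_id IN (:sid, oth.set_id)
        GROUP BY x.item_id
        HAVING COUNT(*) = 1
  )
ORDER BY oth.set_id
"""

def load_sample():
    """Create an in-memory set_item table holding the answer's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE set_item ("
        " set_id INTEGER NOT NULL,"
        " item_id INTEGER NOT NULL,"
        " PRIMARY KEY (set_id, item_id))"
    )
    rows = [(1, 1), (1, 2), (1, 3), (1, 4),
            (2, 1), (2, 2), (2, 3),
            (3, 1), (3, 2), (3, 3), (3, 4), (3, 5),
            (4, 1), (4, 2), (4, 4)]
    conn.executemany("INSERT INTO set_item (set_id, item_id) VALUES (?, ?)", rows)
    return conn

def twins(conn, set_id):
    """Return the ids of all other sets with the same members as set_id."""
    return [row[0] for row in conn.execute(FIND_TWINS, {"sid": set_id})]

conn = load_sample()
print(twins(conn, 1))   # []
conn.execute("INSERT INTO set_item (set_id, item_id) VALUES (2, 4)")
print(twins(conn, 2))   # [1]
```

The check relies on the primary key on (set_id, item_id): without it, a repeated row inside one set would reach a count of 2 and hide a real difference.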

Answer 2: (score: 0)

@wildplasser, that is an interesting but not very practical solution. I created a script to insert sample data:

WITH param AS (
        SELECT 8 AS max
), maxarray AS (
        SELECT array_agg(i) as ma
        FROM (SELECT generate_series(1, max) as i FROM param) as i
), pre AS (
        SELECT *
        FROM (
                SELECT *, CASE WHEN (id >> mbit) & 1 = 1 THEN ma[mbit + 1] END AS item_id
                FROM (
                        SELECT *, generate_series(0, array_upper(ma, 1) - 1) as mbit
                        FROM (
                                SELECT *, generate_series(1,(2^max - 1)::int8) AS id
                                FROM param, maxarray
                        ) AS pre1
                ) AS pre2
        ) AS pre3
        WHERE item_id IS NOT NULL
), ins_item AS (
        INSERT INTO theitem (item_id, item_name)
        SELECT i, i::text
        FROM generate_series(1, (SELECT max FROM param)) as i
        RETURNING *
), ins_set AS (
        INSERT INTO theset (set_id, set_name)
        SELECT id, id::text
        FROM generate_series(1, (SELECT 2^max - 1 FROM param)::int8) as id
        RETURNING *
), ins_set_item AS (
        INSERT INTO set_item (set_id, item_id)
        SELECT id, item_id
        FROM pre
        WHERE (SELECT count(*) FROM ins_item) > 0
          AND (SELECT count(*) FROM ins_set) > 0
        RETURNING *
)
SELECT 'sets', count(*) FROM ins_set
UNION ALL
SELECT 'items', count(*) FROM ins_item
UNION ALL
SELECT 'sets_items', count(*) FROM ins_set_item
;

When I call it with 8 (which gives 1024 rows in set_item for 2^8 - 1 sets), it runs for 21 seconds. That is far too slow. When I turn the trigger off, it takes less than 1 millisecond.

My proposal

Using arrays is very interesting in this case. Unfortunately PostgreSQL does not support foreign keys on arrays, but that could be done with TRIGGERs. I removed the set_item table and added an items int[] field to theset:

[code block not preserved]

This variant runs in less than 1 millisecond.
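The general idea behind this variant can be sketched as follows: store each set's members in one canonical value and let an ordinary UNIQUE constraint reject duplicates. This is not the answer's own code. It is a minimal Python sketch using the built-in sqlite3 module, with a sorted text key standing in for a sorted int[] column; the column `items_key` and the helpers `canonical_key`, `make_db` and `insert_set` are hypothetical names.

```python
import sqlite3

def canonical_key(items):
    """Order-independent, duplicate-free text key for a collection of item ids."""
    return ",".join(str(i) for i in sorted(set(items)))

def make_db():
    """Create an in-memory theset table with a unique canonical member key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE theset ("
        " set_id INTEGER PRIMARY KEY,"
        " set_name TEXT UNIQUE,"
        " items_key TEXT NOT NULL UNIQUE)"  # hypothetical stand-in for items int[]
    )
    return conn

def insert_set(conn, set_id, set_name, items):
    """Insert a set; return False if any uniqueness constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO theset (set_id, set_name, items_key) VALUES (?, ?, ?)",
                (set_id, set_name, canonical_key(items)),
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_db()
print(insert_set(conn, 1, "one", [1, 2, 3, 4]))    # True
print(insert_set(conn, 2, "two", [1, 2, 3]))       # True
print(insert_set(conn, 3, "three", [4, 3, 2, 1]))  # False, same members as set 1
```

The uniqueness check becomes a single index lookup instead of a comparison against every other set, which is consistent with the large timing difference reported above. The members must be stored in a fixed order, because a unique index compares values element by element, and the same holds for array values in PostgreSQL.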