Adding PRIMARY KEY: table contains duplicated values

Time: 2013-02-04 14:12:09

Tags: postgresql foreign-keys primary-key plpgsql postgresql-8.4

In PostgreSQL 8.4.13 I have 2 tables and a procedure to fill the second table:

    create table pref_users (
            id varchar(32) primary key,
            first_name varchar(64),
            last_name varchar(64),
            female boolean,
            avatar varchar(128),
            city varchar(64),
            login timestamp default current_timestamp,
            logout timestamp,
            last_ip inet,
            vip timestamp,
            mail varchar(256)
    );

    create table pref_rep (
            rep_id serial,
            id varchar(32) references pref_users(id) on delete cascade check (id <> author),
            author varchar(32) references pref_users(id) on delete cascade,
            author_ip inet,
            good boolean,
            fair boolean,
            nice boolean,
            about varchar(256),
            stamp timestamp default current_timestamp
            /* primary key(id, author) */
    );

    create or replace function pref_update_rep(_id varchar,
            _author varchar, _author_ip inet,
            _good boolean, _fair boolean, _nice boolean,
            _about varchar) returns void as $BODY$
            begin

            delete from pref_rep
            where id = _id and
            age(stamp) < interval '1 hour' and
            (author_ip & '255.255.255.0'::inet) =
            (_author_ip & '255.255.255.0'::inet);

            update pref_rep set
                author    = _author,
                author_ip = _author_ip,
                good      = _good,
                fair      = _fair,
                nice      = _nice,
                about     = _about,
                stamp     = current_timestamp
            where id = _id and author = _author;

            if not found then
                    insert into pref_rep(id, author, author_ip, good, fair, nice, about)
                    values (_id, _author, _author_ip, _good, _fair, _nice, _about);
            end if;
            end;
    $BODY$ language plpgsql;

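The pref_users table holds general information about the users.

The pref_rep table holds comments (column about) on users (column id) written by other users (column author).

For the second table I forgot to declare the primary key on the (id, author) pair (that line is commented out above).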

I'm trying to add that primary key at the psql prompt, but it fails, probably because for some reason (I don't see how it could happen with the procedure above?) I have several records where the same author has commented on the same id several times:

    # alter table pref_rep add primary key(id, author);
    NOTICE:  ALTER TABLE / ADD PRIMARY KEY will create implicit index "pref_rep_pkey" for table "pref_rep"
    ERROR:  could not create unique index "pref_rep_pkey"
    DETAIL:  Table contains duplicated values.

My question is: how do I find the duplicated (id, author) pairs?

I've tried:

    # select id, count(id) from pref_rep group by id order by count desc limit 5;
           id       | count
    ----------------+-------
     OK408547485023 |   706
     OK261593357402 |   582
     DE11198        |   561
     DE13041        |   560
     OK347613386893 |   556
    (5 rows)

But that of course doesn't give me the pairs...

UPDATE: Catcall's suggestion (thanks!) gave me 190 such duplicated pairs:

               id           |         author         | count
    ------------------------+------------------------+-------
     DE10598                | OK495480409724         |     2
     DE12188                | MR17925810634439466500 |     3
     DE13529                | OK471161192902         |     2
     DE13963                | OK434087948702         |     2
     DE14037                | DE7692                 |     2
     ......
     VK45132921             | DE3544                 |     2
     VK6152782              | OK261593357402         |     2
     VK72883921             | OK506067284178         |     2
    (190 rows)

But actually my real question is how to delete the older versions of the duplicates (by the stamp column)? I've tried many queries at the psql prompt and failed...

2 Answers:

Answer 0 (score: 2)

This should identify the duplicates.

    select id, author
    from pref_rep
    group by id, author
    having count(id) > 1;

You may also have to look for NULLs, since both columns allow NULL.
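A quick way to check for that, as a sketch against the pref_rep table from the question. A primary key implies NOT NULL on its columns, so any row with a NULL id or author would also make ALTER TABLE ... ADD PRIMARY KEY fail. Note too that count(id) skips rows where id is NULL, so the duplicate query would never report such a group.

```sql
-- Count rows that could never be part of a primary key on (id, author).
select count(*) as rows_with_nulls
  from pref_rep
 where id is null
    or author is null;
```

If this returns anything other than 0, those rows need to be fixed or deleted before the key can be added.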

Answer 1 (score: 1)

What about this query (also on SQL Fiddle)?

    DELETE FROM pref_rep p USING (
      SELECT id, author, max(stamp) stamp
        FROM pref_rep
       GROUP BY id, author
      HAVING count(1) > 1) AS f
    WHERE p.id=f.id AND p.author=f.author AND p.stamp<f.stamp;
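One case this DELETE leaves alone: if two duplicates carry exactly the same stamp, p.stamp < f.stamp is false for both and both rows survive. A possible variant, offered only as a sketch, breaks such ties with the rep_id serial column from the question's table. It assumes that a larger rep_id means a later insert, which a serial normally gives but does not strictly guarantee. It also runs the cleanup and the key creation in one transaction so that everything can be rolled back.

```sql
begin;

-- Delete every row for which a newer row with the same (id, author) exists.
-- Ties on stamp are broken by rep_id (assumption: larger rep_id = later insert).
delete from pref_rep p
 using pref_rep newer
 where p.id = newer.id
   and p.author = newer.author
   and (newer.stamp > p.stamp
        or (newer.stamp = p.stamp and newer.rep_id > p.rep_id));

-- Should return zero rows if the cleanup worked.
select id, author, count(*)
  from pref_rep
 group by id, author
having count(*) > 1;

alter table pref_rep add primary key (id, author);

commit;  -- or "rollback;" if the check above still lists duplicates
```

Rows with a NULL stamp are never matched by these comparisons, so they would have to be handled separately.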

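Check the manual on the count() function.

You can specify any expression. 1 means that all rows will be counted, 'cos 1 is never NULL. The same effect is achieved with count(*). In fact I prefer the latter, don't know why I used count(1) this time :)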
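A small self-contained illustration of the difference, using an inline VALUES list with a made-up column x rather than the tables from the question:

```sql
-- x is NULL in one of the three rows.
select count(*) as star,
       count(1) as one,
       count(x) as non_null_x
  from (values (1), (null), (3)) as t(x);
-- star = 3, one = 3, non_null_x = 2
```

count(*) and count(1) count every row, while count(x) counts only the rows where x is not NULL.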