SQL: how to select all rows with a duplicate id, except the first row of each?

Time: 2017-12-08 20:20:19

Tags: sql postgresql select

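I have a database with duplicate ids, and I want to change the names of those duplicate ids.

So, I have a vehicle database whose columns are the license number (id), vehicle type, color and brand.

Every id is repeated in the database, and I want all the rows except the first row for each distinct id (something like "DISTINCT id", but the inverse...).

Edit 2:

I created this table: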

DROP TABLE IF EXISTS Proces1 CASCADE;
CREATE TABLE Proces1 AS
(
    SELECT id_importat AS id_aux, driver_city AS city_aux, driver_state AS state_aux, gender AS g_aux, race AS r_aux
    FROM ImportaViolations
    WHERE id_importat IN (
        SELECT id_importat
        FROM ImportaViolations
        GROUP BY id_importat    
        HAVING (COUNT(*) > 1))
    GROUP BY id_importat, driver_city, driver_state, gender, race
);

In this table the ids are repeated, but the information in the other columns differs.

Something like this:

id_aux   city_aux      state_aux     g_aux     r_aux
1        London        England        M        WHITE
1        London        England        F        BLACK
2        Madrid        Spain          M        BLACK
2        London        England        F        WHITE
2        London        England        M        WHITE
...

So now I want to select all rows that have a repeated id_aux, except the first row for each distinct id_aux. For this example, I expect this final result:

id_aux   city_aux      state_aux      g_aux    r_aux
1        London        England        F        BLACK
2        London        England        F        WHITE
2        London        England        M        WHITE
...

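3 answers:

Answer 0 (score: 0)

I just adapted the answer to this question to better fit your needs: Select first row in each GROUP BY group?

Basically, instead of selecting the first row in each group with rk = 1, I changed the condition to rk > 1 and switched the column names to match yours.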

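WITH Ranked AS (
    SELECT p.Id,
           p.Column1,
           p.Column2,
           -- number the rows inside each Column1 group, highest Column2 first
           ROW_NUMBER() OVER(PARTITION BY p.Column1
                             ORDER BY p.Column2 DESC) AS rk
      FROM MyTable p)
-- keep everything except the first row of each group
SELECT s.*
  FROM Ranked s
 WHERE s.rk > 1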
Edit: changed rk = 2 to rk > 1 so that it selects everything except the first row, not just the second.
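To make the difference between rk = 2 and rk > 1 concrete, here is a minimal sketch that applies the same pattern to the sample rows from the question. It assumes a hypothetical seq column that records the order in which the rows were listed; the real table has no such column, so treat it purely as an illustration.

```sql
-- Sample rows copied from the question; seq is a hypothetical column
-- that stands in for "the order the rows were listed in".
WITH sample (seq, id_aux, city_aux, state_aux, g_aux, r_aux) AS (
    VALUES (1, 1, 'London', 'England', 'M', 'WHITE'),
           (2, 1, 'London', 'England', 'F', 'BLACK'),
           (3, 2, 'Madrid', 'Spain',   'M', 'BLACK'),
           (4, 2, 'London', 'England', 'F', 'WHITE'),
           (5, 2, 'London', 'England', 'M', 'WHITE')
),
ranked AS (
    SELECT s.*,
           ROW_NUMBER() OVER (PARTITION BY s.id_aux ORDER BY s.seq) AS rk
    FROM sample s
)
SELECT id_aux, city_aux, state_aux, g_aux, r_aux, rk
FROM ranked
WHERE rk > 1          -- rk = 2 would miss the third row of id_aux = 2
ORDER BY id_aux, rk;
```

Under that assumption, id_aux = 2 keeps two rows (rk 2 and rk 3), which is why the condition has to be rk > 1 to match the result the question asks for.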

Answer 1 (score: 0)

Something like this?

SELECT
    ID
    , Column1
    , Column2
FROM
    (
        SELECT
            ID
            , Column1
            , Column2
            , ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Column1, Column2) R
        FROM YourTable
    ) Q
WHERE R > 1

Update for Edit 2:

SELECT
    id_aux
    , city_aux
    , state_aux
    , g_aux
    , r_aux
FROM
    (
        SELECT
            id_aux
            , city_aux
            , state_aux
            , g_aux
            , r_aux
            , ROW_NUMBER() OVER (PARTITION BY id_aux ORDER BY id_aux) R
        FROM YourTable
    ) Q
WHERE R > 1
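One caveat may be worth illustrating here: inside PARTITION BY id_aux, ORDER BY id_aux only compares equal values, so the database is free to number the rows of a group in any order. The sketch below shows one possible way to make the numbering repeatable when no ordering column exists, by breaking ties on the remaining columns. The inline sample rows are taken from the question and stand in for the real table; the choice of tie-break columns is an assumption, not something the question specifies.

```sql
-- Sample rows copied from the question, used in place of the real table.
WITH sample (id_aux, city_aux, state_aux, g_aux, r_aux) AS (
    VALUES (1, 'London', 'England', 'M', 'WHITE'),
           (1, 'London', 'England', 'F', 'BLACK'),
           (2, 'Madrid', 'Spain',   'M', 'BLACK'),
           (2, 'London', 'England', 'F', 'WHITE'),
           (2, 'London', 'England', 'M', 'WHITE')
)
SELECT id_aux, city_aux, state_aux, g_aux, r_aux
FROM (
    SELECT s.*,
           -- tie-break on every other column so the numbering is repeatable
           ROW_NUMBER() OVER (PARTITION BY id_aux
                              ORDER BY city_aux, state_aux, g_aux, r_aux) AS r
    FROM sample s
) q
WHERE r > 1
ORDER BY id_aux, r;
```

With this ordering the row that counts as "first" is simply the one that sorts lowest (for id_aux = 1 that is the F/BLACK row), so the output is stable but will not necessarily match the rows the question lists as expected.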

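Answer 2 (score: -1)

Please pay close attention to the order of the rows in the following demo, and note that I have also added some extra rows. For this example, we initially create the "first row" for each id_importat in ImportaViolations, followed by "random" rows: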

INSERT INTO ImportaViolations
    (id_importat, driver_city, driver_state, gender, race)
VALUES
    (1, 'London', 'England', 'M', 'WHITE'),
    (2, 'Madrid', 'Spain', 'M', 'BLACK'),

But if we run this query (with no "order by"):

SELECT id_importat AS id_aux, driver_city AS city_aux
     , driver_state AS state_aux, gender AS g_aux, race AS r_aux
     , rn
FROM (
      select id_importat, driver_city, driver_state, gender, race
           , row_number() over(partition by id_importat) as rn
      from ImportaViolations
     ) d
WHERE rn = 1

The result is:

| id_aux | city_aux | state_aux | g_aux | r_aux | rn |
|--------|----------|-----------|-------|-------|----|
|      1 |   London |   England |     M | WHITE |  1 |
|      2 |   London |   England |     F | WHITE |  1 |

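That result is affected by the necessary partition by (without it, only one row in the entire table would get row number 1).

So, the moral of the story is that you must think carefully about the ORDER that determines the "first row" for each id_importat.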
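If the source table has no column that captures the intended order, one possible approach is to record it at load time. The sketch below is an assumption-laden illustration, not part of the original demo: importa_ordered and row_id are hypothetical names, the identity column syntax requires PostgreSQL 10 or later, and it assumes the identity values reflect the order in which the rows are loaded.

```sql
-- Hypothetical staging copy of ImportaViolations with an explicit load-order column.
CREATE TEMP TABLE importa_ordered (
    row_id       bigint GENERATED ALWAYS AS IDENTITY,  -- assumed to reflect load order
    id_importat  int,
    driver_city  varchar(6),
    driver_state varchar(7),
    gender       varchar(1),
    race         varchar(5)
);

INSERT INTO importa_ordered (id_importat, driver_city, driver_state, gender, race)
VALUES (1, 'London', 'England', 'M', 'WHITE'),
       (2, 'Madrid', 'Spain',   'M', 'BLACK'),
       (1, 'London', 'England', 'F', 'BLACK'),
       (2, 'London', 'England', 'M', 'WHITE');

-- "First row" now has a definite meaning: the lowest row_id per id_importat.
SELECT id_importat, driver_city, driver_state, gender, race
FROM (
    SELECT o.*,
           ROW_NUMBER() OVER (PARTITION BY id_importat ORDER BY row_id) AS rn
    FROM importa_ordered o
) d
WHERE rn > 1
ORDER BY id_importat, row_id;
```

The design choice here is to make the ordering an explicit, stored property of the data, so that the window function no longer depends on whatever order the planner happens to produce.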

SQL Fiddle Demo

CREATE TABLE ImportaViolations
    (id_importat int, driver_city varchar(6), driver_state varchar(7), gender varchar(1), race varchar(5))
;

INSERT INTO ImportaViolations
    (id_importat, driver_city, driver_state, gender, race)
VALUES
    (1, 'London', 'England', 'M', 'WHITE'),
    (2, 'Madrid', 'Spain', 'M', 'BLACK'),
    (1, 'London', 'England', 'F', 'BLACK'),
    (2, 'London', 'England', 'M', 'WHITE'),
    (1, 'London', 'England', 'F', 'BLACK'),
    (2, 'Madrid', 'Spain', 'M', 'BLACK'),
    (2, 'London', 'England', 'F', 'WHITE'),
    (1, 'London', 'England', 'M', 'WHITE'),
    (2, 'London', 'England', 'F', 'WHITE')
;

Main query

DROP TABLE IF EXISTS Proces1 CASCADE;
CREATE TABLE Proces1 AS 
(
    SELECT id_importat AS id_aux, driver_city AS city_aux
         , driver_state AS state_aux, gender AS g_aux, race AS r_aux
         , rn
    FROM (
          select id_importat, driver_city, driver_state, gender, race
               , row_number() over(partition by id_importat order by 1) as rn
          from ImportaViolations
         ) d
    WHERE rn > 1
);

Query 1

select * from Proces1

Results

| id_aux | city_aux | state_aux | g_aux | r_aux | rn |
|--------|----------|-----------|-------|-------|----|
|      1 |   London |   England |     F | BLACK |  2 |
|      1 |   London |   England |     F | BLACK |  3 |
|      1 |   London |   England |     M | WHITE |  4 |
|      2 |   Madrid |     Spain |     M | BLACK |  2 |
|      2 |   Madrid |     Spain |     M | BLACK |  3 |
|      2 |   London |   England |     F | WHITE |  4 |
|      2 |   London |   England |     M | WHITE |  5 |

Query 2

select * from ImportaViolations

Results

| id_importat | driver_city | driver_state | gender |  race |
|-------------|-------------|--------------|--------|-------|
|           1 |      London |      England |      M | WHITE |
|           2 |      Madrid |        Spain |      M | BLACK |
|           1 |      London |      England |      F | BLACK |
|           2 |      London |      England |      M | WHITE |
|           1 |      London |      England |      F | BLACK |
|           2 |      Madrid |        Spain |      M | BLACK |
|           2 |      London |      England |      F | WHITE |
|           1 |      London |      England |      M | WHITE |
|           2 |      London |      England |      F | WHITE |