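INSERT ... ON CONFLICT DO NOTHING - reading in a csv and generating foreign key tables

Time: 2017-02-25 01:57:34

Tags: sql postgresql insert duplicates conflict

I am trying to read in a csv file with the columns artist, album, song and tag.

I want to populate the artists_albums_songs table like this: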

|artist_id|album_id|song_id|
|---------|--------|-------|
|   1     |     1  |     1 |
|   1     |     1  |     2 |
|   1     |     2  |     1 |
...
|  12     |     1  |     1 |
...

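I have designed the tables below and am now trying to populate them. The problem I have when reading in the csv is populating the foreign keys in the artists_albums_songs table.

What is the best way to insert into this table that accomplishes what I am attempting with the INSERT statements below (which return a syntax error)? Thanks.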

create table artists (
    artist_id SERIAL PRIMARY KEY,
    artist VARCHAR(100) NOT NULL UNIQUE
);

create table albums (
    album_id SERIAL PRIMARY KEY,
    album VARCHAR(100) NOT NULL UNIQUE
);

create table songs (
    song_id SERIAL PRIMARY KEY,
    song VARCHAR(250) NOT NULL UNIQUE
);

create table tags (
    tag_id SERIAL PRIMARY KEY,
    tag VARCHAR(100) NOT NULL UNIQUE
);

create table artists_albums_songs (
    artist_id INTEGER NOT NULL,
    album_id INTEGER NOT NULL,
    song_id INTEGER NOT NULL,
    FOREIGN KEY (artist_id) REFERENCES artists(artist_id),
    FOREIGN KEY (album_id) REFERENCES albums(album_id),
    FOREIGN KEY (song_id) REFERENCES songs(song_id),
    PRIMARY KEY (artist_id, album_id, song_id)
);

create table songs_tags (
    song_id INTEGER NOT NULL,
    tag_id INTEGER NOT NULL,
    FOREIGN KEY (song_id) REFERENCES songs(song_id),
    FOREIGN KEY (tag_id) REFERENCES tags(tag_id),
    PRIMARY KEY (song_id, tag_id)
);

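After trying many variations of the statements from the links below, I still cannot get this to work.

I have tried the following statements, but I keep getting errors. The first one returns this error:

org.postgresql.util.PSQLException: ERROR: syntax error at or near "ON" Position: 161;

Does 161 refer to the 161st character of the SQL statement below?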

INSERT INTO artists_albums_songs
SELECT artist_id, album_id, song_id 
FROM artists a 
    JOIN albums b
        ON a.artist = ?
        AND b.album = ?
    JOIN songs c
        ON c.song = ?
    ON DUPLICATE (artist_id, album_id, song_id) DO NOTHING;

INSERT INTO artists_albums_songs
SELECT artist_id, album_id, song_id 
FROM artists a 
    JOIN albums b
        ON a.artist = ?
        AND b.album = ?
    JOIN songs c
        ON c.song = ?
    WHERE NOT EXISTS (
        SELECT * 
        FROM artists_albums_songs
        WHERE * = ?, ?, ?)

INSERT INTO artists_albums_songs
SELECT artist_id, album_id, song_id 
FROM artists a 
    JOIN albums b
        ON a.artist = ?
        AND b.album = ?
    JOIN songs c
        ON c.song = ?
    ON CONFLICT (song_id) IGNORE;

Edit: If I remove the last line from the 3 INSERT statements above, the insert works, but when it hits a duplicate it reports:

org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "artists_albums_songs_pkey"
  Detail: Key (artist_id, album_id, song_id)=(1, 1, 1) already exists.

Insert, on duplicate update in PostgreSQL?

Use INSERT ... ON CONFLICT DO NOTHING RETURNING failed rows

How to UPSERT (MERGE, INSERT ... ON DUPLICATE UPDATE) in PostgreSQL?

2 answers:

Answer 0 (score: 0)

Edit 1: I just realized I can handle these errors in Java! So my solution is simply to add a catch statement that handles the duplicate-key SQLException:

private <T> void insertIntoArtistAlbumSong(T artist, T album, T song) throws SQLException {

    try {

        String artString = artist.toString();
        String albString = album.toString();
        String songString = song.toString();

        // Create SQL insert statement
        String stm =
                "INSERT INTO artists_albums_songs " +
                        "SELECT artist_id, album_id, song_id " +
                        "FROM artists a " +
                        "JOIN albums b " +
                        "ON a.artist = ? " +
                        "AND b.album = ? " +
                        "JOIN songs c " +
                        "ON c.song = ? ;";


        PreparedStatement pstmt = connection.prepareStatement(stm);

        // Set values in prepared statement
        pstmt.setString(1, artString);
        pstmt.setString(2, albString);
        pstmt.setString(3, songString);

        // Insert into table
        pstmt.executeUpdate();

    // ADDED THIS CATCH STATEMENT!
    } catch (SQLException e){
        System.out.println(e.getSQLState());
    }
}
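
The catch block above prints the SQLState of every SQLException, so unrelated failures (a typo in the SQL, a lost connection) are silently skipped as well. A minimal sketch of a narrower check follows, assuming the only error that should be ignored is PostgreSQL's SQLSTATE 23505 (unique_violation). The class `DuplicateKeyCheck`, the interface `SqlAction` and both method names are hypothetical and not part of the original answer.

```java
import java.sql.SQLException;

class DuplicateKeyCheck {

    // PostgreSQL reports primary-key and unique-constraint violations
    // with SQLSTATE 23505 (unique_violation).
    static final String UNIQUE_VIOLATION = "23505";

    static boolean isUniqueViolation(SQLException e) {
        return UNIQUE_VIOLATION.equals(e.getSQLState());
    }

    // Hypothetical functional interface so an insert can be passed as a lambda.
    interface SqlAction {
        void run() throws SQLException;
    }

    // Runs the insert; returns true if it succeeded and false if it was
    // rejected as a duplicate. Any other SQLException is rethrown.
    static boolean runIgnoringDuplicates(SqlAction insert) throws SQLException {
        try {
            insert.run();
            return true;
        } catch (SQLException e) {
            if (isUniqueViolation(e)) {
                return false; // duplicate row: skip it
            }
            throw e; // anything else is a real failure
        }
    }
}
```

This only helps when each insert runs in autocommit mode. Inside an explicit transaction, PostgreSQL aborts the whole transaction after the first error, so later statements fail until a rollback.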

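OK, so I found a solution, but it only works for populating the table (which is all I actually need to do).

  1. Drop the original artists_albums_songs [1] table.
  2. Create the artists_albums_songs [2] table without constraints: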

    CREATE TABLE artists_albums_songs (
        artist_id INTEGER NOT NULL,
        album_id INTEGER NOT NULL,
        song_id INTEGER NOT NULL
    );
    
  3. Then populate the new table [2] with the following statement (via JDBC):

    INSERT INTO artists_albums_songs
    SELECT artist_id, album_id, song_id 
    FROM artists a 
        JOIN albums b
            ON a.artist = ?
            AND b.album = ?
        JOIN songs c
            ON c.song = ?;
    
  4. Create the tmp [3] table with the constraints (via the psql command line):

    CREATE TABLE tmp (
        artist_id INTEGER NOT NULL,
        album_id INTEGER NOT NULL,
        song_id INTEGER NOT NULL,
        FOREIGN KEY (artist_id) REFERENCES artists(artist_id),
        FOREIGN KEY (album_id) REFERENCES albums(album_id),
        FOREIGN KEY (song_id) REFERENCES songs(song_id),
        PRIMARY KEY (artist_id, album_id, song_id)
    );
    
  5. Insert only the distinct rows from the new artists_albums_songs [2] into tmp [3] (via psql):

    INSERT INTO tmp SELECT DISTINCT * FROM artists_albums_songs
    ORDER BY artist_id, album_id, song_id ASC;
    
  6. Drop the new artists_albums_songs [2] and rename tmp [3] to artists_albums_songs (via psql):

    DROP TABLE artists_albums_songs;
    ALTER TABLE tmp RENAME TO artists_albums_songs;
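
Steps 5 and 6 above could also be run from JDBC inside a single transaction, so that a failure halfway through does not leave the database without an artists_albums_songs table. This is only a sketch under that assumption: the class `StagingSwap` and the method `swap` are made up for illustration, and the SQL strings are the ones from the steps.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class StagingSwap {

    // Steps 5 and 6 from the list above, in execution order.
    static final String[] SWAP_STATEMENTS = {
        "INSERT INTO tmp SELECT DISTINCT * FROM artists_albums_songs "
            + "ORDER BY artist_id, album_id, song_id ASC",
        "DROP TABLE artists_albums_songs",
        "ALTER TABLE tmp RENAME TO artists_albums_songs"
    };

    // Executes all statements in one transaction. PostgreSQL DDL is
    // transactional, so a rollback also undoes the DROP and the RENAME.
    static void swap(Connection connection) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false);
        try (Statement stmt = connection.createStatement()) {
            for (String sql : SWAP_STATEMENTS) {
                stmt.executeUpdate(sql);
            }
            connection.commit();
        } catch (SQLException e) {
            connection.rollback();
            throw e;
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```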
    

Answer 1 (score: 0)

This line is wrong:

 ON DUPLICATE (artist_id, album_id, song_id) DO NOTHING;

PostgreSQL uses the ON CONFLICT keyword, not ON DUPLICATE: https://www.postgresql.org/docs/current/static/sql-insert.html
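
As a rough sketch of how the corrected statement might look from JDBC, the example below replaces the last line of the question's first INSERT with ON CONFLICT (artist_id, album_id, song_id) DO NOTHING. ON CONFLICT needs PostgreSQL 9.5 or later. An older server rejects the ON token itself, which may be why the reported error points at "ON". The class `ConflictFreeInsert` and the method `insertLink` are hypothetical; the table and column names come from the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class ConflictFreeInsert {

    // Same SELECT as in the question; only the final clause differs.
    // The conflict target matches the table's composite primary key.
    static final String INSERT_SQL =
            "INSERT INTO artists_albums_songs (artist_id, album_id, song_id) "
            + "SELECT a.artist_id, b.album_id, c.song_id "
            + "FROM artists a "
            + "JOIN albums b ON a.artist = ? AND b.album = ? "
            + "JOIN songs c ON c.song = ? "
            + "ON CONFLICT (artist_id, album_id, song_id) DO NOTHING";

    // Returns the number of rows inserted: 0 when the triple already exists
    // or when one of the names is missing from its lookup table.
    static int insertLink(Connection connection, String artist, String album, String song)
            throws SQLException {
        try (PreparedStatement pstmt = connection.prepareStatement(INSERT_SQL)) {
            pstmt.setString(1, artist);
            pstmt.setString(2, album);
            pstmt.setString(3, song);
            return pstmt.executeUpdate();
        }
    }
}
```

With this form a duplicate row is skipped by the server, so no exception reaches the Java code and no catch block is needed for it.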