Copy multiple CSV files into postgres

Time: 2013-08-30 13:04:24

Tags: postgresql

I am writing a SQL script to copy multiple .CSV files into a postgres database, like this:

COPY product(title, department) from 'ys.csv' CSV HEADER;

I have multiple files to copy. I don't want:

COPY product(title, department) from 'ys1.csv' CSV HEADER;
COPY product(title, department) from 'ys2.csv' CSV HEADER;
COPY product(title, department) from 'ys3.csv' CSV HEADER;
COPY product(title, department) from 'ys4.csv' CSV HEADER;
COPY product(title, department) from 'ys5.csv' CSV HEADER;

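I want to use a for loop for this instead of multiple COPY commands. Is this possible? Thanks

6 answers:

Answer 0: (score: 8)

On Linux, pipe the concatenated files to psql and make COPY read from standard input: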

cat /path_to/ys*.csv | psql -c 'COPY product(title, department) from stdin CSV HEADER'

Look for the equivalent on other operating systems.
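
One caveat when concatenating files this way: the HEADER option skips only the first line of the stream, so the header lines of the second and later files would be loaded as data rows. A minimal sketch of one way around that, assuming every file starts with exactly one header line (the function name `cat_csv_single_header` is made up for illustration):

```shell
# Print the given CSV files as a single stream that keeps only the first
# header line. NR counts lines across all files, FNR restarts at 1 for each
# file, so "FNR > 1" selects every data row and "NR == 1" keeps one header.
cat_csv_single_header() {
    awk 'NR == 1 || FNR > 1' "$@"
}
```

It could then replace cat in the pipeline, for example `cat_csv_single_header /path_to/ys*.csv | psql -c 'COPY product(title, department) from stdin CSV HEADER'`.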

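Answer 1: (score: 3)

I tried the answer above but got an error when working with more than one file. I think the header of the second file did not get stripped.

This worked for me: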

# get filenames (use a glob such as path/FileNamepart*.csv to match several files)
IMPFILES=(path/FileNamepart.csv)

# import the files
for i in "${IMPFILES[@]}"
    do
        psql -U user -d database -c "\copy TABLE_NAME from '$i' DELIMITER ';' CSV HEADER"
        # move the imported file
        mv "$i" /FilePath
    done

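In my case I move each file after it has been imported. If an error occurs, I know where to look. If new files are placed in that location, I can run the script again.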
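
Note that the loop above moves a file even when its import failed. A hedged sketch of a variant that archives a file only after the import command succeeded; the names `import_and_archive` and `load_one` are hypothetical, and it assumes the import command exits non-zero on failure:

```shell
# import_and_archive SRC_DIR DONE_DIR CMD [ARGS...]
# Runs CMD [ARGS...] FILE for every *.csv in SRC_DIR and moves FILE to
# DONE_DIR only if CMD succeeded. Returns non-zero if any import failed.
# The body is a subshell, so its variables do not leak into the caller.
import_and_archive() (
    src_dir=$1
    done_dir=$2
    shift 2
    failed=0
    mkdir -p "$done_dir" || exit 1
    for f in "$src_dir"/*.csv; do
        [ -e "$f" ] || continue          # the glob matched nothing
        if "$@" "$f"; then
            mv "$f" "$done_dir"/
        else
            echo "import failed, left in place: $f" >&2
            failed=1
        fi
    done
    [ "$failed" -eq 0 ]
)

# Possible usage with the psql call from the answer (not executed here):
#   load_one() {
#       psql -U user -d database -v ON_ERROR_STOP=1 \
#           -c "\copy TABLE_NAME from '$1' DELIMITER ';' CSV HEADER"
#   }
#   import_and_archive path /FilePath load_one
```

Files whose import failed stay in the source directory, so rerunning the script retries exactly those.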

Answer 2: (score: 2)

Starting with Postgres 9.3, you can run a shell command using the PROGRAM keyword in the COPY command.

COPY product(title, department) from PROGRAM 'cat ys*.csv' WITH (FORMAT CSV, HEADER);

Answer 3: (score: 1)

You can loop over the file names using pg_ls_dir.

DO $$

DECLARE file_path TEXT; -- Path where your CSV files are
DECLARE fn_i TEXT; -- Variable to hold name of current CSV file being inserted
DECLARE mytable TEXT; -- Variable to hold name of table to insert data into

BEGIN

    file_path := 'C:/Program Files/PostgreSQL/9.6/data/my_csvs/'; -- Declare the path to your CSV files. You probably need to put this in your PostgreSQL file path to avoid permission issues.
    mytable := 'product(title,department)'; -- Declare table to insert data into. You can give columns too since it's just going into an execute statement.

    CREATE TEMP TABLE files AS 
    SELECT file_path || pg_ls_dir AS fn -- get all of the files in the directory, prepending with file path
    FROM pg_ls_dir(file_path);

    LOOP
        EXIT WHEN NOT EXISTS (SELECT 1 FROM files); -- Stop once the queue is empty (also covers an empty directory)
        fn_i := (select fn from files limit 1); -- Pick the first file
        raise notice 'fn: %', fn_i;
        EXECUTE 'COPY ' || mytable || ' from ' || quote_literal(fn_i) || ' with csv header'; -- quote_literal escapes quotes in the file name
        DELETE FROM files WHERE fn = fn_i; -- Delete the file just inserted from the queue
    END LOOP;

END $$;

Answer 4: (score: 0)

There is one more option, using pg_ls_dir and format(). This inserts all files from the "E:\Online_Monitoring\Processed\" folder into the ONLMON_T_Online_Monitoring table.

DO $$
DECLARE
    directory_path VARCHAR(500);
    rec RECORD;
BEGIN
    directory_path := 'E:\\Online_Monitoring\\Processed\\';
    FOR rec IN SELECT pg_ls_dir(directory_path) AS file_name
    LOOP
        EXECUTE format(
            '
                COPY ONLMON_T_Online_Monitoring
                    (
                        item
                    , storeCode
                    , data
                    )
                FROM %L
                WITH (FORMAT CSV, HEADER);
            ', directory_path || rec.file_name
        );
    END LOOP;
END; $$;

Answer 5: (score: 0)

If you want to use the PROGRAM keyword (Postgres 9.3 or later) but every CSV file has its own header, you can have the program drop the header line from each file before COPY reads it.
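
The exact command this answer used is not preserved here. As an assumption, one common way to drop the first line of every file is awk's per-file line counter FNR; the function name `strip_csv_headers` is made up for illustration:

```shell
# Print only the data rows of the given CSV files: FNR restarts at 1 for
# each input file, so "FNR > 1" skips line 1 (the header) of every file.
strip_csv_headers() {
    awk 'FNR > 1' "$@"
}
```

The same awk command could be handed to PROGRAM, without the HEADER option since no header is left, for example `COPY product(title, department) FROM PROGRAM 'awk ''FNR > 1'' /path_to/ys*.csv' WITH (FORMAT CSV);`. The command runs on the database server, so `/path_to` stands for a placeholder directory that the server process can read.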