Question

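I want to copy a CSV file into a Postgres table. The table has about 100 columns, so I don't want to write them all out by hand if I don't have to.

I am using the command \copy table from 'table.csv' delimiter ',' csv; but when the table has not been created I get ERROR: relation "table" does not exist. If I add a blank table I get no error, but nothing happens. I tried this command two or three times with no output or messages, and the table had not been updated when I checked it through PGAdmin.

Is there a way to import a table with the headers included, as I am trying to do?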

Answer 1

This worked. The first row had the column names in it.

COPY wheat FROM 'wheat_crop_data.csv' DELIMITER ';' CSV HEADER

Answer 2

With the Python library pandas, you can easily create the column names and infer the data types from a CSV file.

from sqlalchemy import create_engine
import pandas as pd

engine = create_engine('postgresql://user:pass@localhost/db_name')
df = pd.read_csv('/path/to/csv_file')
df.to_sql('pandas_db', engine)

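The if_exists parameter can be set to replace or append to an existing table, e.g. df.to_sql('pandas_db', engine, if_exists='replace'). This also works for other input file types; see the pandas documentation.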

Answer 3

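A terminal alternative that needs no special permissions

The PostgreSQL documentation for COPY says, in its Notes section:

The path will be interpreted relative to the working directory of the server process (normally the cluster's data directory), not the client's working directory.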

So, in general, using psql or any other client, even with a local server, you run into problems. And if you are writing the COPY command for other users, e.g. in a GitHub README, your readers will run into problems too.

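The only way to express a relative path with client permissions is to use STDIN:

When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server.

As noted here: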

psql
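As a minimal sketch of the STDIN approach described above (an illustration, not the command from the linked answer), the Python below builds a psql invocation that runs COPY ... FROM STDIN and feeds it a client-side file. The function names, the database name, and the choice of the HEADER option are assumptions made for the example; the table name is interpolated as-is, so it should only come from trusted input.

```python
import subprocess


def build_copy_argv(table, database, delimiter=","):
    """Return the psql argument list for COPY ... FROM STDIN with a header row."""
    copy_sql = (
        f"COPY {table} FROM STDIN "
        f"WITH (FORMAT csv, HEADER true, DELIMITER '{delimiter}')"
    )
    return ["psql", "-d", database, "-c", copy_sql]


def copy_csv_via_stdin(csv_path, table, database, delimiter=","):
    """Stream a client-side CSV file to the server through psql's standard input."""
    with open(csv_path, "rb") as f:
        # The path is opened by the client, so it is resolved relative to the
        # client's working directory and read with the client's permissions.
        subprocess.run(build_copy_argv(table, database, delimiter), stdin=f, check=True)
```

Calling copy_csv_via_stdin('./relative_path/file.csv', 'wheat', 'mydb') would then load the file without the server ever needing to read it from disk. The table must already exist.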

Answer 4

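I have been using this function for a while with no problems. You just need to provide the number of columns in the CSV file, and it will take the header names from the first row and create the table for you: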

create or replace function data.load_csv_file
    (
        target_table  text, -- name of the table that will be created
        csv_file_path text,
        col_count     integer
    )

    returns void

as $$

declare
    iter      integer; -- dummy integer to iterate columns with
    col       text; -- to keep column names in each iteration
    col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet

begin
    set schema 'data';

    create table temp_table ();

    -- add just enough number of columns
    for iter in 1..col_count
    loop
        execute format ('alter table temp_table add column col_%s text;', iter);
    end loop;

    -- copy the data from csv file
    execute format ('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_file_path);

    iter := 1;
    col_first := (select col_1
                  from temp_table
                  limit 1);

    -- update the column names based on the first row which has the column names
    for col in execute format ('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
    loop
        execute format ('alter table temp_table rename column col_%s to %s', iter, col);
        iter := iter + 1;
    end loop;

    -- delete the columns row // using quote_ident or %I does not work here!?
    execute format ('delete from temp_table where %s = %L', col_first, col_first);

    -- change the temp table name to the name given as parameter, if not blank
    if length (target_table) > 0 then
        execute format ('alter table temp_table rename to %I', target_table);
    end if;
end;

$$ language plpgsql;
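The same idea, reading the header row and creating one text column per field, can also be sketched on the client side. The Python below is a hedged illustration rather than a port of the function above: create_table_sql and quote_ident are hypothetical helper names, every column is assumed to be text, and the result is only the CREATE TABLE statement, which you would still have to run before loading the data with COPY or \copy and the HEADER option.

```python
import csv


def quote_ident(name):
    """Quote a string as a PostgreSQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_table_sql(csv_path, table, delimiter=","):
    """Build a CREATE TABLE statement with one text column per CSV header field."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        # Only the first row is read; it is assumed to hold the column names.
        header = next(csv.reader(f, delimiter=delimiter))
    columns = ", ".join(f"{quote_ident(col.strip())} text" for col in header)
    return f"CREATE TABLE {quote_ident(table)} ({columns});"
```

Quoting every identifier keeps header names with spaces or mixed case intact, at the cost of having to quote them in later queries as well.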

Answer 5

You can use d6tstack, which creates the table for you and is faster than pd.to_sql() because it uses native database import commands. It supports Postgres as well as MySQL and MS SQL.

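It is also useful for importing multiple CSVs, handling data schema changes, and/or preprocessing with pandas (e.g. for dates) before writing to the database; see the examples notebook.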

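import pandas as pd
import d6tstack.utils  # provides pd_to_psql

df = pd.read_csv('table.csv')
uri_psql = 'postgresql+psycopg2://usr:pwd@localhost/db'
# creates the table 'table' and loads the DataFrame into it
d6tstack.utils.pd_to_psql(df, uri_psql, 'table')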

How to copy from a CSV file to a PostgreSQL table with the headers in the CSV file?
