Question

I have a dataframe like this:

index  userID    OtherIDs
0   abcdef2035  [test650, test447, test968, test95]
1   abcdef3007  [test999, test992, test943, test834]
2   abcdef2006  [test175, test996, test986, test965]
3   abcdef2003  [test339, test968, test87, test678]
4   abcdef3000  [test129, test99, test921, test909]

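The code that generates this dataframe runs every day. I need to upload it to a table named "result" in an existing database. I have to check whether the table "result" exists and, if it does, delete/overwrite its contents with the current values from the dataframe above.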

Postgres database credentials:

PGHOST = 'localhost'
PGDATABASE = 'TestDB'
PGUSER = 'postgres'
PGPASSWORD = 'admin1234'

Answer 1

You can use SQLAlchemy: (https://docs.sqlalchemy.org/en/14/core/engines.html)

together with pandas df.to_sql: (https://pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.to_sql.html)

Assuming the dataframe is named df:

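from sqlalchemy import create_engine
# create_engine expects the connection URL as a string, starting with the dialect name
engine = create_engine('postgresql://user:password@host_ip:port/postgres_database')
# if_exists='replace' drops an existing table and recreates it from df
df.to_sql('results', schema='<schema_name>', con=engine, if_exists='replace')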

Just pass your credentials in the correct format, i.e. a URL string of the form postgresql://user:password@host_ip:port/postgres_database.
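On the "check whether the table exists" part of the question: with if_exists='replace', to_sql creates the table when it is missing and otherwise drops and recreates it, so a separate existence check should not be needed. The following is only a minimal sketch of that behaviour. It assumes an in-memory SQLite database as a stand-in for Postgres so that it runs without a server, and it stores OtherIDs as comma-joined strings because SQLite cannot bind Python lists; both choices are illustrative and not part of the original answer.

```python
import sqlite3

import pandas as pd

# Assumption: in-memory SQLite stands in for the Postgres database.
conn = sqlite3.connect(":memory:")

# Day 1: the table does not exist yet, so to_sql creates it.
day1 = pd.DataFrame({
    "userID": ["abcdef2035", "abcdef3007"],
    "OtherIDs": ["test650,test447", "test999,test992"],
})
day1.to_sql("results", con=conn, if_exists="replace", index=False)

# Day 2: the table exists, so it is dropped and recreated with the new rows.
day2 = pd.DataFrame({
    "userID": ["abcdef2006"],
    "OtherIDs": ["test175,test996"],
})
day2.to_sql("results", con=conn, if_exists="replace", index=False)

# Only the day 2 rows remain in the table.
stored = pd.read_sql("SELECT * FROM results", conn)
print(stored)
```

Against Postgres the same two to_sql calls would take the SQLAlchemy engine as con instead of the SQLite connection.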

Constructing the engine string: assume the following sign_in variable:

sign_in = {
  "database": "TestDB",
  "user": "postgres",
  "password": "<your_password>",
  "host": "localhost",
  "port": "<your_port>"
}

signin_info = 'postgresql+pygresql://'+sign_in['user']+':'+sign_in['password']+'@'+sign_in['host']+':'+sign_in['port']+'/'+sign_in['database']

from sqlalchemy import create_engine
engine = create_engine(signin_info)

df.to_sql('results', schema='<schema_name>', con = engine, if_exists='replace')
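One caveat with plain string concatenation is that a password containing characters such as @, : or / would break the URL. A possible way around this, sketched below, is to percent-encode the user and password with urllib.parse.quote_plus before joining the parts. The helper name build_url, the example password, the port 5432 and the default "postgresql" scheme are all assumptions made for illustration; substitute the driver prefix used above if you rely on it.

```python
from urllib.parse import quote_plus

sign_in = {
    "database": "TestDB",
    "user": "postgres",
    "password": "p@ss:word/1",  # hypothetical password with reserved characters
    "host": "localhost",
    "port": "5432",             # assumption: the default Postgres port
}


def build_url(info, scheme="postgresql"):
    """Build a connection URL, percent-encoding the user and password."""
    user = quote_plus(info["user"])
    password = quote_plus(info["password"])
    return f"{scheme}://{user}:{password}@{info['host']}:{info['port']}/{info['database']}"


signin_info = build_url(sign_in)
print(signin_info)
# postgresql://postgres:p%40ss%3Aword%2F1@localhost:5432/TestDB
```

The resulting signin_info string can then be passed to create_engine in the same way as above.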

How do I insert a pandas dataframe into an existing Postgres SQL database?
