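I need to sync any Postgres table from an up-to-date replica of the production database (source) to a developer's database without wiping its test data. The simplified code below works for most tables, but not for tables with jsonb fields, because of psycopg2.ProgrammingError: can't adapt type 'dict'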
import psycopg2
from psycopg2 import sql
tb = "table_to_be_copied"
##############################
# load data from source DB
##############################
conn_source = psycopg2.connect(host='source_localhost',
                               dbname='postgres',
                               user='xyz',
                               port=source_port)
cursor_source = conn_source.cursor()
cursor_source.execute(
    sql.SQL("SELECT * from {}").format(sql.Identifier(tb))
)
# obtain column names on the fly for any given table
colnames = tuple([desc[0] for desc in cursor_source.description])
# jsonb's type code is 3802. This will help the program determine on the fly
# which columns are in jsonb.
typecodes = tuple([desc[1] for desc in cursor_source.description])
# obtain production data to be synced
rows = cursor_source.fetchall()
cursor_source.close()
conn_source.close()
##############################
# upsert data into destination DB
##############################
conn_dest = psycopg2.connect(host='dest_localhost',
                             dbname='postgres',
                             user='xyz',
                             port=dest_port)
cursor_dest = conn_dest.cursor()
for row in rows:
    # each row is passed twice: once for VALUES, once for DO UPDATE SET
    cursor_dest.execute(
        sql.SQL("INSERT INTO {} ({}) VALUES ({}) "
                "ON CONFLICT (id) DO UPDATE SET ({}) = ({})").format(
            sql.Identifier(tb),
            sql.SQL(', ').join(map(sql.Identifier, colnames)),
            sql.SQL(', ').join(sql.Placeholder() * len(colnames)),
            sql.SQL(', ').join(map(sql.Identifier, colnames)),
            sql.SQL(', ').join(sql.Placeholder() * len(colnames))),
        row * 2)
conn_dest.commit()
cursor_dest.close()
conn_dest.close()
print("Sync done")
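If possible, I would rather not run two queries: one UPSERT for the non-jsonb fields, which would also have to deal with NOT NULL jsonb fields, and a separate UPDATE for the jsonb type conversion.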
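One way to stay with a single UPSERT might be to wrap the values of json/jsonb columns before executing, using the type codes already collected in `typecodes`. The sketch below is only an illustration under assumptions: `adapt_row` and its `wrap` parameter are hypothetical names, 114 is assumed to be the type code for plain json, and `json.dumps` stands in for the wrapper so the snippet runs without a database. With psycopg2 installed, `psycopg2.extras.Json` could be passed as `wrap` instead.

```python
import json

JSONB_OID = 3802  # type code for jsonb, as noted in the code above
JSON_OID = 114    # assumed type code for plain json


def adapt_row(row, typecodes, wrap=json.dumps):
    """Return a copy of `row` with json/jsonb values passed through `wrap`.

    `adapt_row` and `wrap` are hypothetical names for this sketch.
    None is left untouched so that it is still sent as SQL NULL.
    """
    return tuple(
        wrap(value) if code in (JSONB_OID, JSON_OID) and value is not None else value
        for value, code in zip(row, typecodes)
    )


# Example: the second column is jsonb and holds a dict, the third is a NULL jsonb.
example = adapt_row((1, {"a": 1}, None), (23, 3802, 3802))
print(example)
```

In the loop above, this would mean passing `adapt_row(row, typecodes) * 2` in place of `row * 2`, which keeps everything in one statement per row. Note that a jsonb value holding JSON `null` is also returned as `None` by psycopg2, so this sketch would send it as SQL NULL.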
Thanks.