Question

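I have a table in a PostgreSQL database with the columns name and email. The name column is populated, but the email column is empty. I also have a CSV file with the columns user and user_email, both populated. How can I use the CSV file to fill in the email column, matching user against name and copying the corresponding user_email into email?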

I have been searching for answers and reading tutorials, but I am not sure how to phrase the question, so I have not found anything.

This is what my Postgres table looks like right now:

| name     | email    |
| joh      |          |
| alex     |          |
| adams    |          |

This is my CSV file:

| user     | user_email |
| joh      | a@g.com  |
| alex     | a@g.com  |
| adams    | a@g.com  |

Answer 1

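You need to create an intermediate table (also known as a "staging table").

Then import the CSV file into that table. After that, you can update the target table from the imported data: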

update target_table
   set email = st.user_email
from staging_table st
where st."user" = target_table.name;

This assumes that name is unique in the target table and that each user appears only once in the input file.
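To try the staging-table flow end to end without a Postgres server, here is a rough sketch that uses Python's built-in sqlite3 as a stand-in. The sample data, the function name update_via_staging, and the use of SQLite are assumptions for illustration only. In Postgres the load step would normally be COPY or psql's \copy rather than row-by-row inserts. The sketch also checks the "each user appears only once" assumption before updating, and it uses a correlated subquery instead of UPDATE ... FROM so that it runs on older SQLite versions too.

```python
import csv
import io
import sqlite3

# Hypothetical sample data mirroring the CSV in the question.
CSV_TEXT = "user,user_email\njoh,a@g.com\nalex,a@g.com\nadams,a@g.com\n"


def update_via_staging(conn, csv_file):
    """Load CSV rows into a staging table, then update target_table from it."""
    conn.execute('CREATE TEMP TABLE staging_table ("user" TEXT, user_email TEXT)')
    reader = csv.DictReader(csv_file)
    conn.executemany(
        'INSERT INTO staging_table ("user", user_email) VALUES (?, ?)',
        [(r["user"], r["user_email"]) for r in reader],
    )
    # Check the uniqueness assumption: a user listed twice makes the update ambiguous.
    dupes = conn.execute(
        'SELECT "user" FROM staging_table GROUP BY "user" HAVING COUNT(*) > 1'
    ).fetchall()
    if dupes:
        raise ValueError(f"duplicate users in CSV: {[d[0] for d in dupes]}")
    # Correlated subquery: a portable equivalent of UPDATE ... FROM.
    conn.execute(
        """
        UPDATE target_table
           SET email = (SELECT st.user_email FROM staging_table st
                         WHERE st."user" = target_table.name)
         WHERE EXISTS (SELECT 1 FROM staging_table st
                        WHERE st."user" = target_table.name)
        """
    )
    conn.execute("DROP TABLE staging_table")


# Build a small in-memory target table that looks like the one in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target_table (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO target_table (name) VALUES (?)",
    [("joh",), ("alex",), ("adams",)],
)
update_via_staging(conn, io.StringIO(CSV_TEXT))
rows = conn.execute("SELECT name, email FROM target_table ORDER BY name").fetchall()
print(rows)
```

The WHERE EXISTS clause keeps rows that have no match in the CSV untouched instead of overwriting their email with NULL.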

Answer 2

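Here is one way to do it using Python and SQLAlchemy. This example uses a MySQL connection.

First read the CSV into a pandas DataFrame. Then reflect the table you want to update from the database. Reflecting a database object loads information about it from the corresponding schema object that already exists in the database. Finally, you can iterate over the DataFrame (using iterrows()) and execute an update statement against the table for each row.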

import pymysql
pymysql.install_as_MySQLdb()
from sqlalchemy.ext.automap import automap_base
from sqlalchemy import create_engine, update
from sqlalchemy.orm import Session
import pandas as pd

# Read the csv containing emails to be updated
emails_to_be_updated = pd.read_csv('~/Path/To/Your/file.csv', encoding="utf8")

connection_string = "connection:string:to:yourdb"
engine = create_engine(connection_string, echo=False)

# SQLAlchemy: Reflect the tables
Base = automap_base()
# SQLAlchemy 1.4+; older versions use Base.prepare(engine, reflect=True)
Base.prepare(autoload_with=engine)

# Mapped classes are now created with names by default matching that of the table name.
# Note: automap only maps tables that have a primary key.
Database_Table = Base.classes.name_email_table

# Iterate through rows that need to be updated
with Session(engine) as session:
    for index, row in emails_to_be_updated.iterrows():
        # Match on name and set email from the CSV row
        update_statement = (
            update(Database_Table)
            .where(Database_Table.name == row['user'])
            .values(email=row['user_email'])
        )
        session.execute(update_statement)
    # Commit once after all rows have been processed
    session.commit()
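If you want to try the per-row update pattern without pandas, SQLAlchemy, or a running database server, here is a small stand-in sketch using only the standard library (csv and sqlite3). The function name update_emails, the sample rows, and the SQLite backend are assumptions for illustration, not part of the answer above. It runs one parameterized UPDATE per CSV row and collects the users that matched no row in the table, which is a cheap way to spot spelling mismatches between user and name.

```python
import csv
import io
import sqlite3


def update_emails(conn, csv_file):
    """Run one parameterized UPDATE per CSV row; return users that matched no row."""
    unmatched = []
    for row in csv.DictReader(csv_file):
        cur = conn.execute(
            "UPDATE name_email_table SET email = ? WHERE name = ?",
            (row["user_email"], row["user"]),
        )
        # rowcount is the number of rows the UPDATE modified.
        if cur.rowcount == 0:
            unmatched.append(row["user"])
    conn.commit()
    return unmatched


# Hypothetical table and CSV contents, loosely based on the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE name_email_table (name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO name_email_table (name) VALUES (?)", [("joh",), ("alex",)]
)
sample = io.StringIO("user,user_email\njoh,j@g.com\nalex,a@g.com\nbob,b@g.com\n")

unmatched = update_emails(conn, sample)
result = dict(conn.execute("SELECT name, email FROM name_email_table"))
print(result, unmatched)
```

Binding the values as parameters (the ? placeholders) rather than formatting them into the SQL string avoids quoting problems with names or addresses that contain special characters.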
