我有一个SQL Server,上面有要使用熊猫更改数据的数据库。我知道如何使用pyodbc将数据获取到DataFrame中,但是后来我不知道如何将该DataFrame重新获取到SQL Server中。
我试图用sqlalchemy创建一个引擎并使用to_sql
命令,但由于我的引擎永远无法正确连接到我的数据库,因此无法正常工作。
import pyodbc
import pandas
server = "server"
db = "db"
conn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+db+';Trusted_Connection=yes')
cursor = conn.cursor()
df = cursor.fetchall()
data = pandas.DataFrame(df)
conn.commit()
答案 0 :(得分:0)
您可以use pandas.DataFrame.to_sql将数据框插入SQL Server。此方法支持SQLAlchemy支持的数据库。
这是一个如何实现此目的的示例:
from sqlalchemy import create_engine, event
from urllib.parse import quote_plus
import logging
import sys
import numpy as np
from datetime import datetime, timedelta
# setup logging
logging.basicConfig(stream=sys.stdout,
filemode='a',
format='%(asctime)s.%(msecs)3d %(levelname)s:%(name)s: %(message)s',
datefmt='%m-%d-%Y %H:%M:%S',
level=logging.DEBUG)
logger = logging.getLogger(__name__) # get the name of the module
def write_to_db(df, database_name, table_name):
"""
Creates a sqlalchemy engine and write the dataframe to database
"""
# replacing infinity by nan
df = df.replace([np.inf, -np.inf], np.nan)
user_name = 'USERNAME'
pwd = 'PASSWORD'
db_addr = '10.00.000.10'
chunk_size = 40
conn = "DRIVER={SQL Server};SERVER="+db_addr+";DATABASE="+database_name+";UID="+user_name+";PWD="+pwd+""
quoted = quote_plus(conn)
new_con = 'mssql+pyodbc:///?odbc_connect={}'.format(quoted)
# create sqlalchemy engine
engine = create_engine(new_con)
# Write to DB
logger.info("Writing to database ...")
st = datetime.now() # start time
# WARNING!! -- overwrites the table using if_exists='replace'
df.to_sql(table_name, engine, if_exists='replace', index=False, chunksize=chunk_size)
logger.info("Database updated...")
logger.info("Data written to '{}' databsae into '{}' table ...".format(database_name, table_name))
logger.info("Time taken to write to DB: {}".format((datetime.now()-st).total_seconds()))
调用此方法会将您的数据帧写入数据库,请注意,如果数据库中已经存在相同名称的表,它将替换该表。