
Time: 2018-05-02 13:29:48

Tags: python sql-server pyodbc

I am currently using Python to run the simple query below, which inserts data into a SQL Server table with pyodbc:

import pyodbc

table_name = 'my_table'
insert_values = [(1,2,3),(2,2,4),(3,4,5)]

cnxn = pyodbc.connect(...)
cursor = cnxn.cursor()
cursor.execute(
    ' '.join([
        'insert into',
        table_name,
        'values',
        ','.join(
            [str(i) for i in insert_values]
        )
    ])
)
cursor.commit()

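This should work as long as there are no duplicate keys (assume the first column contains the key). However, for data with duplicate keys (data that already exists in the table), it raises an error. How can I insert multiple rows into a SQL Server table with pyodbc in one go, so that rows with duplicate keys are simply updated?

Note: solutions have been proposed for a single row of data, but I want to insert multiple rows at once (avoiding loops)!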

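3 answers:

Answer 0 (score: 6)

This can be done with MERGE. Say you have a key column ID and two columns col_a and col_b (you need to specify the column names in the update statement); the statement would then look like this: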

MERGE INTO MyTable as Target
USING (SELECT * FROM 
       (VALUES (1, 2, 3), (2, 2, 4), (3, 4, 5)) 
       AS s (ID, col_a, col_b)
      ) AS Source
ON Target.ID=Source.ID
WHEN NOT MATCHED THEN
INSERT (ID, col_a, col_b) VALUES (Source.ID, Source.col_a, Source.col_b)
WHEN MATCHED THEN
UPDATE SET col_a=Source.col_a, col_b=Source.col_b;

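You can try it at rextester.com/IONFW62765

Basically, I am creating a Source table "on the fly" from the list of values you want to upsert. When you then merge the Source table with the Target, you can test the MATCHED condition (Target.ID=Source.ID) on every row (whereas you would be limited to a single row if you used only a simple IF <exists> INSERT (...) ELSE UPDATE (...) condition).

In Python using pyodbc, it should look something like this: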

import pyodbc

insert_values = [(1, 2, 3), (2, 2, 4), (3, 4, 5)]
table_name = 'my_table'
key_col = 'ID'
col_a = 'col_a'
col_b = 'col_b'

cnxn = pyodbc.connect(...)
cursor = cnxn.cursor()
cursor.execute(('MERGE INTO {table_name} as Target '
                'USING (SELECT * FROM '
                '(VALUES {vals}) '
                'AS s ({k}, {a}, {b}) '
                ') AS Source '
                'ON Target.{k}=Source.{k} '
                'WHEN NOT MATCHED THEN '
                'INSERT ({k}, {a}, {b}) VALUES (Source.{k}, Source.{a}, Source.{b}) '
                'WHEN MATCHED THEN '
                'UPDATE SET {a}=Source.{a}, {b}=Source.{b};'
                .format(table_name=table_name,
                        vals=','.join([str(i) for i in insert_values]),
                        k=key_col,
                        a=col_a,
                        b=col_b)))
cursor.commit()

You can read more about MERGE in the SQL Server docs.
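The same multi-row MERGE can also be built with "?" placeholders rather than by formatting the values into the SQL text. The following is a minimal sketch under stated assumptions, not the answer's own code: `build_merge` is a hypothetical helper, and the table and column names are the example names used above. It only builds the statement and a flat parameter list, so it runs without a database connection.

```python
def build_merge(table, key_col, other_cols, rows):
    """Return (sql, params) for a multi-row MERGE that uses "?" placeholders."""
    cols = [key_col, *other_cols]
    col_list = ", ".join(cols)
    # One "(?, ?, ?)" group per row, joined into a single VALUES list.
    row_slot = "(" + ", ".join(["?"] * len(cols)) + ")"
    values = ", ".join([row_slot] * len(rows))
    source_cols = ", ".join(f"Source.{c}" for c in cols)
    updates = ", ".join(f"{c}=Source.{c}" for c in other_cols)
    sql = (
        f"MERGE INTO {table} AS Target "
        f"USING (VALUES {values}) AS Source ({col_list}) "
        f"ON Target.{key_col}=Source.{key_col} "
        f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({source_cols}) "
        f"WHEN MATCHED THEN UPDATE SET {updates};"
    )
    # Flatten the row tuples so each value lines up with one "?".
    params = [value for row in rows for value in row]
    return sql, params


insert_values = [(1, 2, 3), (2, 2, 4), (3, 4, 5)]
sql, params = build_merge("my_table", "ID", ["col_a", "col_b"], insert_values)
# With an open pyodbc cursor: cursor.execute(sql, params)
```

Table and column names cannot be passed as parameters, so they should still come from trusted code. SQL Server also caps a single statement at 2100 parameters, so large batches would need to be split into chunks.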

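Answer 1 (score: 2)

I am following up on the existing answers here because they are potentially prone to injection attacks, and it is better to use parameterized queries (for mssql / pyodbc, these are the "?" placeholders). I tweaked Alexander Novas's code slightly to use dataframe rows in a parameterized version of the query with sqlalchemy: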

# assuming you already have a dataframe "df" and sqlalchemy engine called "engine"
# also assumes your dataframe columns have all the same names as the existing table
import numpy as np
import pandas as pd

table_name_to_update = 'update_table'
table_name_to_transfer = 'placeholder_table'

# the dataframe and existing table should both have a column to use as the primary key
primary_key_col = 'id'

# replace the placeholder table with the dataframe
df.to_sql(table_name_to_transfer, engine, if_exists='replace', index=False)

# building the command terms
cols_list = df.columns.tolist()
cols_list_query = f'({(", ".join(cols_list))})'
sr_cols_list = [f'Source.{i}' for i in cols_list]
sr_cols_list_query = f'({(", ".join(sr_cols_list))})'
up_cols_list = [f'{i}=Source.{i}' for i in cols_list]
up_cols_list_query = f'{", ".join(up_cols_list)}'
    
# fill values that should be interpreted as "NULL" with None
def fill_null(vals: list) -> tuple:
    def bad(val):
        if isinstance(val, type(pd.NA)):
            return True
        # NaN never compares equal to itself, so test for it explicitly
        if isinstance(val, float) and np.isnan(val):
            return True
        # the list of values you want to interpret as 'NULL' should be
        # tweaked to your needs
        return val in ['NULL', 'nan', '', '-', '?']
    return tuple(i if not bad(i) else None for i in vals)

# create the list of parameter indicators (?, ?, ?, etc...)
# and the parameters, which are the values to be inserted
params = [fill_null(row.tolist()) for _, row in df.iterrows()]
param_slots = '('+', '.join(['?']*len(df.columns))+')'
    
cmd = f'''
       MERGE INTO {table_name_to_update} as Target
       USING (SELECT * FROM
       (VALUES {param_slots})
       AS s {cols_list_query}
       ) AS Source
       ON Target.{primary_key_col}=Source.{primary_key_col}
       WHEN NOT MATCHED THEN
       INSERT {cols_list_query} VALUES {sr_cols_list_query} 
       WHEN MATCHED THEN
       UPDATE SET {up_cols_list_query};
       '''

# execute the command to merge tables
with engine.begin() as conn:
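    # exec_driver_sql (SQLAlchemy 1.4+) hands the "?" placeholders straight
    # to pyodbc; a list of tuples runs the statement once per row
    conn.exec_driver_sql(cmd, params)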

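This approach is also better if you are inserting strings that contain characters incompatible with SQL insert text (such as apostrophes, which mess up the insert statement), because it lets the connection engine handle the parameterized values (which also makes it safer against SQL injection attacks).

For reference, I am creating the engine connection with the code below. You will obviously need to adapt it to your server/database/environment and to whether or not you want fast_executemany: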
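The apostrophe problem can be reproduced without a SQL Server instance. The sketch below uses Python's built-in sqlite3 module purely as a stand-in, on the assumption that it is a fair illustration because it shares the "?" placeholder style with pyodbc; the `people` table and its rows are made up for the example.

```python
import sqlite3

rows = [(1, "O'Brien"), (2, "plain")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")

# String formatting: the apostrophe in the name ends the SQL string literal
# early, so the statement no longer parses.
unsafe_sql = "INSERT INTO people VALUES (%d, '%s')" % rows[0]
try:
    conn.execute(unsafe_sql)
    formatted_ok = True
except sqlite3.OperationalError:
    formatted_ok = False

# Parameterized: the values travel separately from the SQL text, so the
# apostrophe is stored as ordinary data.
conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
stored = conn.execute("SELECT name FROM people ORDER BY id").fetchall()
conn.close()
```

Here `formatted_ok` ends up `False`, while the parameterized insert stores both names unchanged.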

import urllib.parse
import pyodbc
pyodbc.pooling = False
import sqlalchemy

terms = urllib.parse.quote_plus(
            'DRIVER={SQL Server Native Client 11.0};'
            'SERVER=<your server>;'
            'DATABASE=<your database>;'
            'Trusted_Connection=yes;'  # log on using Windows credentials
)

url = f'mssql+pyodbc:///?odbc_connect={terms}'
engine = sqlalchemy.create_engine(url, fast_executemany=True)

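Edit: I realized that this code does not actually use the "placeholder" table at all; it simply copies the values directly from the dataframe rows by way of the parameterized command.

Answer 2 (score: 0)

Given a dataframe (df), I used the code from ksbg to upsert into a table. Note that I look for a match on two columns (date and station code); you can use just one of them. The code generates the query for any given df.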
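A minimal sketch of how such a query might be generated for a match on two columns follows. It is not this answer's own code: `build_merge_for_keys`, the `weather` table, and the column names are hypothetical, and a plain list of names stands in for `df.columns.tolist()`.

```python
def build_merge_for_keys(table, columns, key_cols):
    """Build a one-row MERGE statement that matches on every column in key_cols."""
    non_keys = [c for c in columns if c not in key_cols]
    col_list = ", ".join(columns)
    slots = ", ".join(["?"] * len(columns))
    # Every key column must match for a row to count as MATCHED.
    on_clause = " AND ".join(f"Target.{c}=Source.{c}" for c in key_cols)
    source_cols = ", ".join(f"Source.{c}" for c in columns)
    # Only the non-key columns are overwritten on a match.
    updates = ", ".join(f"{c}=Source.{c}" for c in non_keys)
    return (
        f"MERGE INTO {table} AS Target "
        f"USING (VALUES ({slots})) AS Source ({col_list}) "
        f"ON {on_clause} "
        f"WHEN NOT MATCHED THEN INSERT ({col_list}) VALUES ({source_cols}) "
        f"WHEN MATCHED THEN UPDATE SET {updates};"
    )


# Hypothetical names; with pandas, df.columns.tolist() could supply the list.
columns = ["obs_date", "station_code", "temperature"]
merge_sql = build_merge_for_keys("weather", columns, ["obs_date", "station_code"])
```

The resulting statement takes one row per execution, so it could be passed to pyodbc's `cursor.executemany(merge_sql, rows)` together with a list of row tuples.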
