Query a SQL database using DataFrame values and append the results back to the DataFrame

Time: 2019-05-27 17:22:49

Tags: python sql pandas

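I have a DataFrame whose values I use to build queries against a SQL database, and I would like to append the information returned by those queries to that DataFrame. The original DataFrame looks like this: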

df =

ID YEAR CODE 
43 2013 051
97 2015 087
...

My current code is:

import pandas as pd
import pypyodbc as podbc

db = podbc.connect('Driver={SQL Server};Server=server;Database=database')

for row in df:
    cursor = db.cursor()
    query = '''
    SELECT ID, Tool, Date, Version
    FROM table
    WHERE ID = '{id}'
    AND Year(Date) = '{year}'
    AND Code = '{code}'
    '''.format(id = df.ID, year = df.YEAR, code = df.CODE)

    cursor.execute(query)
    rows = cursor.fetchall()
    pd.DataFrame(rows, columns=[x[0] for x in cursor.description])

A single query returns something like this:

    ID   Tool   Date         Version
0   43   C15    22-05-2013   1.0

So my problems right now are:

1. I don't know how to create an iterable query (for row in df)
2. I don't know how to relate the new dataframe that the query creates to the original df
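
On the first point, it may help to see what iterating a DataFrame actually yields. The sketch below is a minimal illustration, assuming a toy frame built from the two rows shown in the question. It contrasts `for ... in df`, which walks the column labels, with `df.itertuples()`, which walks the rows. The names `labels` and `rows` are only for illustration.

```python
import pandas as pd

# Toy frame mirroring the question's columns (values taken from the question).
df = pd.DataFrame({"ID": [43, 97], "YEAR": [2013, 2015], "CODE": ["051", "087"]})

# Iterating a DataFrame directly yields its column labels, not its rows.
labels = [item for item in df]

# itertuples() yields one namedtuple per row, so each field is a single value.
rows = [(row.ID, row.YEAR, row.CODE) for row in df.itertuples(index=False)]

print(labels)
print(rows)
```

This is also why `df.ID` inside `.format(...)` does not help: it refers to the whole column on every pass of the loop rather than to the current row's value.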

The result I am hoping for looks like this:

ID YEAR CODE Tool Date        Version
43 2013 051  C15  22-05-2013  1.0
97 2015 087  C67  31-01-2015  2.0

1 answer:

Answer 0 (score: 0)

You are already using pandas.

Try this:

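conn = podbc.connect('Driver={SQL Server};Server=server;Database=database')

# ? placeholders let the driver bind one row's values per query, instead of
# formatting whole columns (df.ID is a Series) into the SQL string.
query = ("SELECT ID, Tool, Date, Version "
         "FROM table "
         "WHERE ID = ? AND "
         "Year(Date) = ? AND "
         "Code = ?")

# Run the query once per row of df and stack the results into one DataFrame.
frames = [pd.read_sql(query, conn, params=[row.ID, row.YEAR, row.CODE])
          for row in df.itertuples(index=False)]
result = pd.concat(frames, ignore_index=True)
    # For large result sets, pass chunksize=chunk_size to pd.read_sql
    # (it then returns an iterator of DataFrames)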

Note: do not forget to add a space at the end of each line of the query, because the lines are joined without newlines.
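
To relate the query results back to the original DataFrame, one option is to collect the per-row results and merge them on `ID`. The following is a minimal sketch under stated assumptions, not a tested SQL Server solution: an in-memory SQLite database stands in for the real server, the table name `tools` and its rows are made up to mirror the question's sample output, dates are stored in ISO format, and `strftime('%Y', ...)` replaces `Year(...)`, which SQLite does not have.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for the SQL Server one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tools (ID INTEGER, Tool TEXT, Date TEXT, Version TEXT, Code TEXT)"
)
# Hypothetical rows mirroring the question's sample output (ISO dates).
conn.executemany(
    "INSERT INTO tools VALUES (?, ?, ?, ?, ?)",
    [
        (43, "C15", "2013-05-22", "1.0", "051"),
        (97, "C67", "2015-01-31", "2.0", "087"),
    ],
)

df = pd.DataFrame({"ID": [43, 97], "YEAR": [2013, 2015], "CODE": ["051", "087"]})

# Placeholders are bound by the driver, one row of df at a time.
query = """
    SELECT ID, Tool, Date, Version
    FROM tools
    WHERE ID = ? AND strftime('%Y', Date) = ? AND Code = ?
"""

frames = []
for row in df.itertuples(index=False):
    # strftime returns text, so the year is passed as a string here.
    params = (int(row.ID), str(row.YEAR), row.CODE)
    frames.append(pd.read_sql(query, conn, params=params))

# Stack the per-row results, then attach them to the original frame.
lookup = pd.concat(frames, ignore_index=True)
result = df.merge(lookup, on="ID", how="left")
print(result)
```

Merging on `ID` alone assumes each `ID` appears only once in `df`. If the same `ID` can occur with several year and code combinations, the lookup frame would also need to carry those columns so they can be included in the merge keys. A left merge keeps rows of `df` for which the query returned nothing.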