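I have a dataframe whose values I use to build queries against a SQL database, and I want to append the information returned by each query to that dataframe. The original dataframe looks like this: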
df =
ID YEAR CODE
43 2013 051
97 2015 087
...
My current code is:
import pandas as pd
import pypyodbc as podbc
db = podbc.connect('Driver={SQL Server};Server=server;Database=database')
for row in df:
    cursor = db.cursor()
    query = '''
    SELECT ID, Tool, Date, Version
    FROM table
    WHERE ID = '{id}'
    AND Year(Date) = '{year}'
    AND Code = '{code}'
    '''.format(id = df.ID, year = df.YEAR, code = df.CODE)
    cursor.execute(query)
    rows = cursor.fetchall()
    pd.DataFrame(rows, columns=[x[0] for x in cursor.description])
The result of a single query looks like this:
ID Tool Date Version
0 43 C15 22-05-2013 1.0
So my problems right now are:
1. I don't know how to create an iterable query (for row in df)
2. I don't know how to relate the new dataframe that the query creates to the original df
The result I am hoping for would look like this:
ID YEAR CODE Tool Date Version
43 2013 051 C15 22-05-2013 1.0
97 2015 087 C67 31-01-2015 2.0
Answer 0 (score: 0)
You are already using pandas, so try this:
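conn = podbc.connect('Driver={SQL Server};Server=server;Database=database')
# "?" placeholders let the driver bind one row's values instead of formatting whole columns into the SQL text
query = ''' SELECT ID, Tool, Date, Version
FROM table
WHERE ID = ? AND
Year(Date) = ? AND
Code = ? '''.replace("\n", " ")
# Run the query once per row of df and stack the results into one dataframe
results = pd.concat(
    [pd.read_sql(query, conn, params=[int(row.ID), int(row.YEAR), str(row.CODE)])
     for row in df.itertuples()],
    ignore_index=True)
# For large result sets, pass chunksize=chunk_size to pd.read_sql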
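Note: if the newlines are stripped from the query, make sure a space still separates the end of each line from the start of the next.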
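To make both parts of the question concrete, here is a minimal, self-contained sketch of one possible approach: run the query once per row, then join the results back onto the original dataframe. It is not the setup from the question. It assumes an in-memory SQLite database in place of SQL Server, a hypothetical table name `tools`, dates stored as ISO strings, and SQLite's `strftime('%Y', ...)` standing in for `Year(...)`.

```python
import sqlite3

import pandas as pd

# Stand-in for the real SQL Server database (assumption: in-memory SQLite,
# hypothetical table name "tools", dates stored as ISO strings).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tools (ID INTEGER, Tool TEXT, Date TEXT, Version TEXT, Code TEXT)"
)
conn.executemany(
    "INSERT INTO tools VALUES (?, ?, ?, ?, ?)",
    [
        (43, "C15", "2013-05-22", "1.0", "051"),
        (97, "C67", "2015-01-31", "2.0", "087"),
    ],
)

# The original dataframe from the question; CODE is text so leading zeros survive.
df = pd.DataFrame({"ID": [43, 97], "YEAR": [2013, 2015], "CODE": ["051", "087"]})

# SQLite has no Year(); strftime('%Y', ...) plays that role in this sketch.
query = """
    SELECT ID, Tool, Date, Version
    FROM tools
    WHERE ID = ?
      AND strftime('%Y', Date) = ?
      AND Code = ?
"""

# Question 1: itertuples() yields one row at a time
# (a plain "for row in df" yields the column names instead).
frames = [
    pd.read_sql(query, conn, params=(int(row.ID), str(row.YEAR), row.CODE))
    for row in df.itertuples(index=False)
]
results = pd.concat(frames, ignore_index=True)

# Question 2: relate the query results to the original rows by joining on ID.
merged = df.merge(results, on="ID", how="left")
print(merged)
```

The left join keeps every row of `df` even when a query returns nothing, in which case the new columns are empty for that row. Joining on `ID` alone assumes each ID appears only once in `df`; if it can repeat across years or codes, the year and code would need to be part of the join key as well.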