I am trying to use Jupyter Notebook to pull the result of a SQL query into a Pandas dataframe.
I followed these instructions from beardc:
import pandas as pd
df = pd.read_sql(sql, cnxn)
cnxn = pyodbc.connect(connection_info)
cursor = cnxn.cursor()
sql = """SELECT * FROM AdventureWorks2012.Person.Address
WHERE City = 'Bothell'
ORDER BY AddressID ASC"""
df = psql.frame_query(sql, cnxn)
cnxn.close()
However, whenever I run the code, it shows:
NameError
Traceback (most recent call last)
<ipython-input-5-4ea4efb152fe> in <module>()
1 import pandas as pd
2
3 df = pd.read_sql(sql, cnxn)
4
5 cnxn = pyodbc.connect(connection_info)
NameError: name 'sql' is not defined
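I am working on a monitored network (a company network, in case anyone asks).
I have a few questions:
Do I need to change connection_info to the details of my own database? I am using the latest Anaconda distribution.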
Answer 0 (score: 1)
The error you are getting is caused by the order of your code:
1 import pandas as pd
2 df = pd.read_sql(sql, cnxn) ## You call the variable sql here, but don't assign it until line 6
3
4 cnxn = pyodbc.connect(connection_info)
5 cursor = cnxn.cursor()
6 sql = """SELECT * FROM AdventureWorks2012.Person.Address
7 WHERE City = 'Bothell'
8 ORDER BY AddressID ASC"""
9 df = psql.frame_query(sql, cnxn)
10 cnxn.close()
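Line 2 uses the variable sql, but that variable is not actually defined until line 6. The same goes for cnxn, which is not created until line 4. Try arranging the code as follows:
(Note that this code is untested, and other issues are described below.)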
#Import the libraries
import pandas as pd
import pyodbc
#Give the connection info
cnxn = pyodbc.connect(connection_info)
#Assign the SQL query to a variable
sql = "SELECT * FROM AdventureWorks2012.Person.Address WHERE City = 'Bothell' ORDER BY AddressID ASC"
#Read the SQL to a Pandas dataframe
df = pd.read_sql(sql, cnxn)
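If you want to check the ordering without touching the real server, here is a minimal sketch of the same sequence (connect, define the query, then read). It assumes an in-memory SQLite database from the standard library as a stand-in for the pyodbc connection, and the Address table and its rows are made up for illustration; they are not AdventureWorks2012 data.

```python
import sqlite3

import pandas as pd

# 1. Open the connection first. An in-memory SQLite database stands in
#    for the SQL Server connection here (assumption, illustration only).
cnxn = sqlite3.connect(":memory:")

# Hypothetical stand-in table with a few made-up rows.
cnxn.execute("CREATE TABLE Address (AddressID INTEGER, City TEXT)")
cnxn.executemany(
    "INSERT INTO Address (AddressID, City) VALUES (?, ?)",
    [(3, "Bothell"), (1, "Bothell"), (2, "Seattle")],
)

# 2. Define the query before anything tries to use it.
sql = "SELECT * FROM Address WHERE City = 'Bothell' ORDER BY AddressID ASC"

# 3. Only now read the result into a DataFrame; both names already exist.
df = pd.read_sql(sql, cnxn)

# 4. Close the connection once the data is in memory.
cnxn.close()

print(df)
```

Moving the pd.read_sql line above either of the two assignments should bring back the same NameError, which is an easy way to confirm that ordering is the cause.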
To answer your questions: