JOIN query in a Python script returns the wrong number of rows

Time: 2015-08-28 10:51:45

Tags: python join sqlite

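I wrote the following Python code, which connects to a database, creates two tables, and joins them. It then prints the result of the JOIN query.

The problem is that I get 3 rows, but I expect 2. Moreover, if I run the same query at the sqlite> command prompt, the JOIN returns the correct number of rows, namely 2.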

import sqlite3 as lite
import pandas as pd

# Connecting to the database. The `connect()` method returns a connection object.
con = lite.connect('getting_started.db')

with con:
    cur = con.cursor()
    cur.execute("DROP TABLE IF EXISTS cities")
    cur.execute("DROP TABLE IF EXISTS weather")

    cur.execute("CREATE TABLE cities (name text, state text)")
    cur.execute("CREATE TABLE weather (city text, year integer, warm_month text, cold_month text, average_high integer)")

    # Filling 'cities' with the data
    cur.execute("INSERT INTO cities VALUES('Washington', 'DC')")
    cur.execute("INSERT INTO cities VALUES('Houston', 'TX')")

    # Filling 'weather' with the data
    cur.execute("INSERT INTO weather VALUES('Washington', 2013, 'July', 'January', 59)")
    cur.execute("INSERT INTO weather VALUES('Houston', 2013, 'July', 'January', 62)")

    # Joining data together
    sql = "SELECT name, state, year, warm_month, cold_month FROM cities " \
          "INNER JOIN weather " \
          "ON name = city"
    cur.execute(sql)

rows = cur.fetchall()
cols = [desc[0] for desc in cur.description]

# Loading data into pandas
df = pd.DataFrame(rows, columns=cols)

for index, row in df.iterrows():
    print("City: {0}, The warmest month: {1}".format(row['name'],row['warm_month']))

In Python, the result is:

City: Washington, The warmest month: July
City: Washington, The warmest month: July
City: Houston, The warmest month: July

At the command prompt, however, the result is different (and correct):

City: Washington, The warmest month: July
City: Houston, The warmest month: July

1 answer:

Answer 0 (score: 2)

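The problem is that your rows = cur.fetchall() is outside the con connection context manager, so you end up reading from the cursor after the with block has already finished, and strange things happen.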

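See the documentation here: https://docs.python.org/2/library/sqlite3.html#using-the-connection-as-a-context-manager. It shows that with con: provides a transaction (committed when the block exits normally, rolled back on an exception; the connection itself is not closed), which may explain the odd behaviour of executing a statement inside the transaction but then trying to use the cursor outside it.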
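To make the transaction behaviour concrete, here is a minimal sketch. The in-memory database and the table t are made up for illustration and are not part of the question's code. It only shows what the linked documentation describes: the with block commits on success, rolls back on an exception, and leaves the connection open.

```python
import sqlite3

# Assumption: a throwaway in-memory database, not getting_started.db.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x integer)")  # hypothetical table for the demo

# Successful block: the INSERT is committed when the block exits.
with con:
    con.execute("INSERT INTO t VALUES (1)")

# Failing block: the INSERT is rolled back when the exception propagates.
try:
    with con:
        con.execute("INSERT INTO t VALUES (2)")
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

# The connection is still open after both blocks, so this query still runs.
values = [row[0] for row in con.execute("SELECT x FROM t ORDER BY x")]
print(values)

con.close()
```

Only the row from the successful block should remain, and the final SELECT works because leaving the with block did not close the connection.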

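It still seems strange to me, though. I would have expected this kind of usage to make sqlite3 raise an exception telling you that this had happened.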
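A sketch of the change this answer implies: move fetchall() and the cur.description lookup inside the with block, so the cursor is fully read before the transaction is committed. To keep the example self-contained it uses an in-memory database and plain tuples instead of getting_started.db and pandas; both substitutions are assumptions made for illustration, not part of the original code.

```python
import sqlite3 as lite

# Assumption: in-memory database so the sketch leaves no file behind.
con = lite.connect(":memory:")

with con:
    cur = con.cursor()
    cur.execute("CREATE TABLE cities (name text, state text)")
    cur.execute("CREATE TABLE weather (city text, year integer, "
                "warm_month text, cold_month text, average_high integer)")

    cur.execute("INSERT INTO cities VALUES('Washington', 'DC')")
    cur.execute("INSERT INTO cities VALUES('Houston', 'TX')")
    cur.execute("INSERT INTO weather VALUES('Washington', 2013, 'July', 'January', 59)")
    cur.execute("INSERT INTO weather VALUES('Houston', 2013, 'July', 'January', 62)")

    cur.execute("SELECT name, state, year, warm_month, cold_month FROM cities "
                "INNER JOIN weather "
                "ON name = city")

    # Read everything from the cursor while still inside the with block,
    # i.e. before the transaction is committed on exit.
    rows = cur.fetchall()
    cols = [desc[0] for desc in cur.description]

con.close()

# rows is an ordinary list now, so it is safe to use after the block.
for row in rows:
    record = dict(zip(cols, row))
    print("City: {0}, The warmest month: {1}".format(record['name'], record['warm_month']))
```

Because rows and cols are plain Python objects once fetched, they can be passed to pd.DataFrame(rows, columns=cols) afterwards exactly as in the question.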