Postgres用Python

时间:2018-06-06 17:19:19

标签: python performance psycopg2

我有一段我目前不理解的代码,我不确定它是否按预期运行。如果有人可以提供有关如何重构此代码的一些信息,我们将不胜感激。

conn = psycopg2.connect(conn_string) cur = conn.cursor()
query = """ 
        SELECT 
          users.id, 
          users.company_id, 
          users.state, 
          users.phase, 
          transactions.amount 
        FROM 
          cases 
          INNER JOIN transactions ON user.id = transactions.user_id
        WHERE users.company_id IN (9, 15) 
        AND transactions.date < '2018-01-01' 
"""

fields = ['id','company_id', 'state', 'phase','amount']

cur.execute(query) data = cur.fetchall()
internal = []
for d in data: 
    internal += [ [''] * 4 ] 
    internal[-1][0] = d[1] 
    internal[-1][1] = d[fields.index('phase')] == 'activated' 
    internal[-1][2] = d[fields.index('state')] 
    internal[-1][3] = 'success' if d[4] > 0 else 'fail'

print(internal)

1 个答案:

答案 0 :(得分:0)

这将引导您逐步完成代码,注释总是适用于他们后面的行解释

# Imports and connection string setup
# ...

# You create a connection object and a cursor here, which are prerequisites to 
# executing queries.
conn = psycopg2.connect(conn_string)
cur = conn.cursor()

# This is your query, however this is static and you would have to manipulate 
# this query every time you want something different.
# If you want to dynamically change stuff look at the links 
# posted below the code.

query = """
        SELECT
            users.id,
            users.company_id,
            users.state,
            users.phase,
            transactions.amount
        FROM
            cases
            INNER JOIN transactions ON user.id = transactions.user_id
        WHERE users.company_id IN (9, 15)
        AND transactions.date < '2018-01-01'
"""

# The following line is unnecessary, although it may help understand what
# these columns are named
# To get all column names from the DB you can use
#   colnames = [desc[0] for desc in curs.description]
# After you executed your query below
fields = ['id', 'company_id', 'state', 'phase', 'amount']

cur.execute(query)

# Here you fetch all data, which can be a lot (you may want to check
# memory consumption)
data = cur.fetchall()

# Setting up a result list and iterate over the results of the query
internal = []
for d in data:
    # The next line is not needed, we do not need to append an empty list
    # to the results as we can just append the final result for this row
    # further += is overloading the operator, it is cleaner (IMHO) to
    # use .append() here
    internal += [[''] * 4]
    # Now the remainder in the loop is interesting, we add an empty list
    # of values to the end of the results (internal) and then use the last
    # index to update the values
    # Apart from printing the full results in the last step this will probably
    # incur some overhead compared to creating the single row's result first
    # and then appending/printing it

    # This simply assigns the value of the current rows index 1 to the last
    # item in internal (the empty list added before)
    internal[-1][0] = d[1]

    # Here we obtain the index of 'phase' from fields to know which column
    # the value is

    # If you never change the fields variable above, your can just simply index
    # it directly.

    # After knowing the index (fields.index('phase')) which would be 3 we do a
    # comparison if it is 'activated' as a value
    # So the result will be True or False for everything that is
    # not 'activated'
    internal[-1][1] = d[fields.index('phase')] == 'activated'
    # Here we simply find the column idx for state (which is 2) and add the
    # value.
    internal[-1][2] = d[fields.index('state')]
    # Here we check if the amount is > 0 and assign 'success' if so, otherwise
    # 'fail'.
    internal[-1][3] = 'success' if d[4] > 0 else 'fail'

# You are printing ALL results here, for simple check you may want to print
# only the first or last result as such:

# print(internal[0])
# or
# print(internal[-1])

print(internal)

[编辑]在此处添加重构结果“解析”(这只是其中一个选项)

conn = psycopg2.connect(conn_string)
cur = conn.cursor()

query = """
        SELECT
            users.id,
            users.company_id,
            users.state,
            users.phase,
            transactions.amount
        FROM
            cases
            INNER JOIN transactions ON user.id = transactions.user_id
        WHERE users.company_id IN (9, 15)
        AND transactions.date < '2018-01-01'
"""

fields = ['id', 'company_id', 'state', 'phase', 'amount']

cur.execute(query)
data = cur.fetchall()

internal = []
for d in data:
    # We simply create a list with all the same results and append it
    row = [
        d[1],
        d[fields.index('phase')] == 'activated',
        d[fields.index('state')],
        'success' if d[4] > 0 else 'fail'
    ]
    internal.append(row)

# print(internal)
print(internal[0])

可以对循环数据进行进一步的简化(从某种意义上说,使其更易于理解),但这些不会严重提高性能。

Postgres Python Query Parameters