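This code fetches data and processes it in a loop until the loop finishes.
So I need to append the data produced by each iteration to a df that stores all of it.
Code: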
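import pandas as pd

# engine: an existing database connection, created elsewhere
a = "SELECT id FROM USER WHERE time >'2018-03-01'"
dataa = pd.read_sql_query(a, con=engine)
print(dataa)
for userid in dataa['id']:  # iterate over the id values, not the column labels
    x = f"SELECT idbody FROM col1 WHERE user_id='{userid}'"
    data = pd.read_sql_query(x, con=engine)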
The data processed here is different on every iteration, so each result needs to be appended to a df that stores all of the processed data.
Answer 0 (score: 1)
I assume you get the same number of columns on every iteration and that the columns have the same names. Here is the basic idea:
import pandas as pd

df = pd.DataFrame()  # this will hold all of your data
df1 = pd.DataFrame([(1, 2, 3)], columns=['a', 'b', 'c'])  # 1st iteration data
df2 = pd.DataFrame([(11, 22, 33)], columns=['a', 'b', 'c'])  # 2nd iteration data
df3 = pd.DataFrame([(111, 222, 333)], columns=['a', 'b', 'c'])  # 3rd iteration data, etc.
for data in [df1, df2, df3]:
    # DataFrame.append was removed in pandas 2.0; there, use pd.concat([df, data], ignore_index=True)
    df = df.append(data, ignore_index=True)  # append the loop variable, not df1
print(df)
     a    b    c
0    1    2    3
1   11   22   33
2  111  222  333
What you need to do is:
a = "SELECT id FROM USER WHERE time >'2018-03-01'"
dataa = pd.read_sql_query(a, con=engine)
print(dataa)
df_all = pd.DataFrame()  # create an empty df to store all results
for userid in dataa['id']:  # iterate over the id values, not the column labels
    x = f"SELECT idbody FROM col1 WHERE user_id='{userid}'"
    data = pd.read_sql_query(x, con=engine)
    df_all = df_all.append(data)  # update df_all with each new frame (pandas < 2.0 only)
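One detail worth checking in a loop like this: iterating over a DataFrame directly yields its column labels, not the values stored in its rows. A minimal sketch, using made-up ids as a stand-in for the result of the first query:

```python
import pandas as pd

# Stand-in for the result of the first query (the ids are made up).
dataa = pd.DataFrame({"id": [7, 8, 9]})

# Iterating over the DataFrame itself yields the column labels.
labels = [userid for userid in dataa]

# Iterating over the column's values yields the ids.
values = [userid for userid in dataa["id"].tolist()]

print(labels)  # ['id']
print(values)  # [7, 8, 9]
```

Looping over `dataa` itself would therefore run the inner query once, with the string `id` as the user id.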
Answer 1 (score: 1)
You can also use concat:
a = "SELECT id FROM USER WHERE time >'2018-03-01'"
dataa = pd.read_sql_query(a, con=engine)
print(dataa)
df = pd.DataFrame()
for userid in dataa['id']:  # iterate over the id values, not the column labels
    x = f"SELECT idbody FROM col1 WHERE user_id='{userid}'"
    data = pd.read_sql_query(x, con=engine)
    df = pd.concat([df, data])  # concatenate onto df itself; df_all is not defined here
Now:
print(df)
will give the desired output.
Answer 2 (score: 1)
Append each result to a list, either in a loop or with a list comprehension, and call concat only once:
a = "SELECT id FROM USER WHERE time >'2018-03-01'"
dataa = pd.read_sql_query(a, con=engine)
dfs = []
for userid in dataa['id']:  # iterate over the id values, not the column labels
    x = f"SELECT idbody FROM col1 WHERE user_id='{userid}'"
    data = pd.read_sql_query(x, con=engine)
    dfs.append(data)
df = pd.concat(dfs, ignore_index=True)

# The same thing as a list comprehension:
dfs = [pd.read_sql_query(f"SELECT idbody FROM col1 WHERE user_id='{userid}'", con=engine)
       for userid in dataa['id']]
df = pd.concat(dfs, ignore_index=True)
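To try the collect-then-concat pattern without a real database, here is a minimal sketch against an in-memory SQLite database. The `users` table, the sample rows, and the `sqlite3` connection standing in for `engine` are all assumptions made for illustration. It also passes the user id through `params` instead of formatting it into the SQL string; the `?` placeholder is specific to `sqlite3`, and other drivers use a different style.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database standing in for `engine`.
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE users (id INTEGER, time TEXT);
    CREATE TABLE col1 (user_id INTEGER, idbody TEXT);
    INSERT INTO users VALUES (1, '2018-02-01'), (2, '2018-03-05'), (3, '2018-04-10');
    INSERT INTO col1 VALUES (1, 'a'), (2, 'b'), (2, 'c'), (3, 'd');
    """
)

dataa = pd.read_sql_query(
    "SELECT id FROM users WHERE time > '2018-03-01' ORDER BY id", con
)

dfs = []
# .tolist() converts the ids to plain Python ints, which sqlite3 can bind.
for userid in dataa["id"].tolist():
    data = pd.read_sql_query(
        "SELECT idbody FROM col1 WHERE user_id = ? ORDER BY idbody",
        con,
        params=(userid,),
    )
    dfs.append(data)

# One concat at the end instead of one per iteration.
df = pd.concat(dfs, ignore_index=True)
print(df)
```

Only users 2 and 3 pass the date filter here, so `df` ends up with the three `idbody` rows that belong to them, indexed 0 to 2.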
Answer 3 (score: 1)
Another approach: instead of looping, why not join all of the userids into one string and make a single call to the database with a SQL IN clause:
a = "SELECT id FROM USER WHERE time >'2018-03-01'"
dataa = pd.read_sql_query(a, con=engine)
# single-quote each id: standard SQL reserves double quotes for identifiers
userids = ', '.join([f"'{x}'" for x in dataa['id'].astype(str).values])
x = f"SELECT idbody FROM col1 WHERE user_id IN ({userids})"
data = pd.read_sql_query(x, con=engine)

# Example with sample ids:
dataa = pd.DataFrame({'id': ['123', '124', '125', '126']})
userids = ', '.join([f"'{x}'" for x in dataa['id'].astype(str).values])
x = f"SELECT idbody FROM col1 WHERE user_id IN ({userids})"
print(x)
[out]
# SELECT idbody FROM col1 WHERE user_id IN ('123', '124', '125', '126')
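A variation on the same single-query idea is to build one placeholder per id and pass the ids through `params`, so the values never have to be quoted by hand. This is only a sketch: the in-memory SQLite database, its sample rows, and the `?` placeholder style (specific to `sqlite3`) are assumptions for illustration, not part of the original setup.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database standing in for `engine`.
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE col1 (user_id TEXT, idbody TEXT);
    INSERT INTO col1 VALUES ('123', 'a'), ('124', 'b'), ('999', 'z'), ('126', 'c');
    """
)

dataa = pd.DataFrame({"id": ["123", "124", "125", "126"]})
ids = dataa["id"].astype(str).tolist()

# One "?" per id; the driver binds the actual values.
placeholders = ", ".join("?" for _ in ids)
query = (
    f"SELECT user_id, idbody FROM col1 "
    f"WHERE user_id IN ({placeholders}) ORDER BY user_id"
)

data = pd.read_sql_query(query, con, params=ids)
print(query)
print(data)
```

The row for user `999` is filtered out, and id `125` simply has no matching rows, so `data` holds the three rows for `123`, `124`, and `126`.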