我有一个如下所示的数据框
df = pd.DataFrame({
'R_Id':[11,21,3,14,51,22],
'R_name' : ['READ_11','READ_21','READ_3','READ_14','READ_51','READ_22']
})
请注意,df仅包含唯一的ID
这是我一直在手动编写的查询
select person_id,
count(*) filter (where reading = 11) as cnt_read_11,
min(value) filter (where reading = 11) as min_read_11,
max(value) filter (where reading = 11) as max_read_11,
avg(value) filter (where reading = 11) as avg_read_11,
stddev(value) filter (where reading = 11) as stdev_read_11,
count(*) filter (where reading = 21) as cnt_read_21,
min(value) filter (where reading = 21) as min_read_21,
max(value) filter (where reading = 21) as max_read_21,
avg(value) filter (where reading = 21) as avg_read_21,
stddev(value) filter (where reading = 21) as stdev_read_21,
from table
group by person_id;
如您所见,模板遵循三个规则
a)每次阅读都会有5句话(count,min,max,avg,stddev
)
b)从df中获取R_Id
并将其放在where
子句中
c)从df中获取R_name
,并将其放在每个列名的末尾。例如:cnt_read_11
,min_read_11
等
您能帮我实现这一点并为df中存在的所有读数生成查询吗??
答案 0 :(得分:1)
您可以使用"""
指定模板,并使用itertuples
在f字符串中设置值:
head = """
select person_id,"""
tail="""from table
group by person_id;
"""
out = []
for t in df.itertuples():
temp = f"""
count(*) filter (where reading = {t.R_Id}) as cnt_{t.R_name},
min(value) filter (where reading = {t.R_Id}) as min_{t.R_name},
max(value) filter (where reading = {t.R_Id}) as max_{t.R_name},
avg(value) filter (where reading = {t.R_Id}) as avg_{t.R_name},
stddev(value) filter (where reading = {t.R_Id}) as stdev_{t.R_name},
"""
out.append(temp)
fin = head + ''.join(out) + tail
print (fin)
select person_id,
count(*) filter (where reading = 11) as cnt_READ_11,
min(value) filter (where reading = 11) as min_READ_11,
max(value) filter (where reading = 11) as max_READ_11,
avg(value) filter (where reading = 11) as avg_READ_11,
stddev(value) filter (where reading = 11) as stdev_READ_11,
count(*) filter (where reading = 21) as cnt_READ_21,
min(value) filter (where reading = 21) as min_READ_21,
max(value) filter (where reading = 21) as max_READ_21,
avg(value) filter (where reading = 21) as avg_READ_21,
stddev(value) filter (where reading = 21) as stdev_READ_21,
count(*) filter (where reading = 3) as cnt_READ_3,
min(value) filter (where reading = 3) as min_READ_3,
max(value) filter (where reading = 3) as max_READ_3,
avg(value) filter (where reading = 3) as avg_READ_3,
stddev(value) filter (where reading = 3) as stdev_READ_3,
count(*) filter (where reading = 14) as cnt_READ_14,
min(value) filter (where reading = 14) as min_READ_14,
max(value) filter (where reading = 14) as max_READ_14,
avg(value) filter (where reading = 14) as avg_READ_14,
stddev(value) filter (where reading = 14) as stdev_READ_14,
count(*) filter (where reading = 51) as cnt_READ_51,
min(value) filter (where reading = 51) as min_READ_51,
max(value) filter (where reading = 51) as max_READ_51,
avg(value) filter (where reading = 51) as avg_READ_51,
stddev(value) filter (where reading = 51) as stdev_READ_51,
count(*) filter (where reading = 22) as cnt_READ_22,
min(value) filter (where reading = 22) as min_READ_22,
max(value) filter (where reading = 22) as max_READ_22,
avg(value) filter (where reading = 22) as avg_READ_22,
stddev(value) filter (where reading = 22) as stdev_READ_22,
from table
group by person_id;