How do I generate a query from a template?

Time: 2019-10-17 09:58:19

Tags: python python-3.x pandas dataframe automation

I have a dataframe like the one shown below:

import pandas as pd

# One row per reading: numeric ID and the name used in the column aliases
df = pd.DataFrame({
'R_Id':[11,21,3,14,51,22],
'R_name' : ['READ_11','READ_21','READ_3','READ_14','READ_51','READ_22']
})


Note that df contains only unique IDs.

This is the query I have been writing by hand:

select person_id,
   count(*) filter (where reading = 11) as cnt_read_11,
   min(value) filter (where reading = 11) as min_read_11,
   max(value) filter (where reading = 11) as max_read_11,
   avg(value) filter (where reading = 11) as avg_read_11,
   stddev(value) filter (where reading = 11) as stdev_read_11,
   count(*) filter (where reading = 21) as cnt_read_21,
   min(value) filter (where reading = 21) as min_read_21,
   max(value) filter (where reading = 21) as max_read_21,
   avg(value) filter (where reading = 21) as avg_read_21,
   stddev(value) filter (where reading = 21) as stdev_read_21
   from table
   group by person_id;

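As you can see, the template follows three rules:

a) Each reading gets 5 expressions (count, min, max, avg, stddev).

b) Take R_Id from df and put it in the where clause.

c) Take R_name from df and append it to the end of each column name, e.g. cnt_read_11, min_read_11.

Could you help me implement this and generate the query for all the readings present in df?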

1 Answer:

Answer 0: (score: 1)

You can specify the template with triple quotes (""") and use itertuples to set the values in an f-string:

head = """
select person_id,"""

tail="""from table
    group by person_id;

"""

out = []
for t in df.itertuples():
    temp = f"""
       count(*) filter (where reading = {t.R_Id}) as cnt_{t.R_name},
       min(value) filter (where reading = {t.R_Id}) as min_{t.R_name},
       max(value) filter (where reading = {t.R_Id}) as max_{t.R_name},
       avg(value) filter (where reading = {t.R_Id}) as avg_{t.R_name},
       stddev(value) filter (where reading = {t.R_Id}) as stdev_{t.R_name},
    """
    out.append(temp)

# Drop the comma left after the last expression so the SQL stays valid before "from"
fin = head + ''.join(out).rstrip().rstrip(',') + '\n    ' + tail
print(fin)

select person_id,
       count(*) filter (where reading = 11) as cnt_READ_11,
       min(value) filter (where reading = 11) as min_READ_11,
       max(value) filter (where reading = 11) as max_READ_11,
       avg(value) filter (where reading = 11) as avg_READ_11,
       stddev(value) filter (where reading = 11) as stdev_READ_11,

       count(*) filter (where reading = 21) as cnt_READ_21,
       min(value) filter (where reading = 21) as min_READ_21,
       max(value) filter (where reading = 21) as max_READ_21,
       avg(value) filter (where reading = 21) as avg_READ_21,
       stddev(value) filter (where reading = 21) as stdev_READ_21,

       count(*) filter (where reading = 3) as cnt_READ_3,
       min(value) filter (where reading = 3) as min_READ_3,
       max(value) filter (where reading = 3) as max_READ_3,
       avg(value) filter (where reading = 3) as avg_READ_3,
       stddev(value) filter (where reading = 3) as stdev_READ_3,

       count(*) filter (where reading = 14) as cnt_READ_14,
       min(value) filter (where reading = 14) as min_READ_14,
       max(value) filter (where reading = 14) as max_READ_14,
       avg(value) filter (where reading = 14) as avg_READ_14,
       stddev(value) filter (where reading = 14) as stdev_READ_14,

       count(*) filter (where reading = 51) as cnt_READ_51,
       min(value) filter (where reading = 51) as min_READ_51,
       max(value) filter (where reading = 51) as max_READ_51,
       avg(value) filter (where reading = 51) as avg_READ_51,
       stddev(value) filter (where reading = 51) as stdev_READ_51,

       count(*) filter (where reading = 22) as cnt_READ_22,
       min(value) filter (where reading = 22) as min_READ_22,
       max(value) filter (where reading = 22) as max_READ_22,
       avg(value) filter (where reading = 22) as avg_READ_22,
       stddev(value) filter (where reading = 22) as stdev_READ_22
    from table
    group by person_id;
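
As a possible variation, the five expressions per reading could be kept in a list and the whole select list joined with commas, so no comma ends up in front of "from". This is only a minimal sketch: build_query and AGGREGATES are hypothetical names that do not appear in the answer above, the lowercasing of R_name is an assumption made to match the column names in the question, and "table" is just the placeholder table name from the question.

```python
# Hypothetical helper names (AGGREGATES, build_query); not part of the original answer.
AGGREGATES = [
    ("cnt", "count(*)"),
    ("min", "min(value)"),
    ("max", "max(value)"),
    ("avg", "avg(value)"),
    ("stdev", "stddev(value)"),
]


def build_query(readings, table="table"):
    """Build the aggregate query for an iterable of (R_Id, R_name) pairs."""
    columns = []
    for r_id, r_name in readings:
        for prefix, expr in AGGREGATES:
            # int() rejects non-numeric IDs before they are pasted into the SQL text.
            columns.append(
                f"   {expr} filter (where reading = {int(r_id)}) "
                f"as {prefix}_{r_name.lower()}"
            )
    # Joining with ",\n" puts a comma between expressions but none after the last.
    select_list = ",\n".join(["select person_id"] + columns)
    return f"{select_list}\nfrom {table}\ngroup by person_id;"


query = build_query([(11, "READ_11"), (21, "READ_21")])
print(query)
```

With the dataframe from the question, the pairs could be supplied as zip(df['R_Id'], df['R_name']).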