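I'm trying to write a Python script that queries an S3 bucket through Athena using the PyAthenaJDBC library. The library is great, but I'm having a problem with string formatting.
I build the query as a string in a separate function and pass that string to cursor.execute(query).
The query string contains quotes, as shown below: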
SELECT day, elb_name
,COUNT (*) AS c
,100.0 * (
1.0 - (
SUM (
CASE
WHEN elb_response_code LIKE '5%' THEN 1
ELSE 0
END
) / cast(COUNT (*) as double)
)
) AS success_rate
,100.0 * SUM (
CASE
WHEN backend_processing_time < 0.1 THEN 1
ELSE 0
END
) / cast(COUNT (*) as double) AS t_lt_pt1
,100.0 * SUM (
CASE
WHEN backend_processing_time < 1 THEN 1
ELSE 0
END
) / cast(COUNT (*) as double) AS t_lt_1
,100.0 * SUM (
CASE
WHEN backend_processing_time < 5 THEN 1
ELSE 0
END
) / cast(COUNT (*) as double) AS t_lt_5
,100.0 * SUM (
CASE
WHEN backend_processing_time < 10 THEN 1
ELSE 0
END
) / cast(COUNT (*) as double) AS t_lt_10
FROM elb_logs_raw_native_part
WHERE year = '2017' AND
month = '03' AND
elb_name is not NULL AND
elb_name != ''
GROUP BY day, elb_name
ORDER BY c DESC
This fails at the second single quote in the LIKE '5%' clause:
ValueError: unsupported format character ''' (0x27) at index 186
I can avoid the error and run the query successfully by changing this line in the library, https://github.com/laughingman7743/PyAthenaJDBC/blob/master/pyathenajdbc/formatter.py#L115, from
return (operation % kwargs).strip()
to
return (operation).strip()
At that point in the code, operation == query (pasted above) and kwargs == {}.
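The failure can be reproduced without Athena at all. The following is a minimal sketch that only mimics the quoted `(operation % kwargs).strip()` expression; the shortened query fragment and the variable names are illustrative and are not taken from the library itself.

```python
# A shortened stand-in for the real query; only the LIKE '5%' part matters here.
operation = "SELECT CASE WHEN elb_response_code LIKE '5%' THEN 1 ELSE 0 END"
kwargs = {}

error_message = None
try:
    # Same expression as the line quoted from formatter.py.
    (operation % kwargs).strip()
except ValueError as exc:
    # The % operator reads "%'" as a conversion specifier and rejects it,
    # even though there is nothing to substitute.
    error_message = str(exc)

print(error_message)
```

The empty dictionary does not switch formatting off: the % operator still scans the whole string for conversion specifiers.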
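My specific question: am I building my query incorrectly? Or is this native string-formatting behavior that I don't understand, where attempting a substitution with an empty dictionary is a bad idea?
Answer 0 (score: 1)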
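If the query is used as a string to be formatted, you need to double every % character that you want to keep as a literal % character.
Instead of WHEN elb_response_code LIKE '5%' THEN 1, use WHEN elb_response_code LIKE '5%%' THEN 1. Once the line you pointed to in formatter.py has run, each %% is converted to a single %.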
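As a rough illustration of the doubling rule, the sketch below applies the same % formatting to an escaped fragment. The helper `escape_percent` is hypothetical (it is not part of PyAthenaJDBC) and assumes the query contains no intentional placeholders such as %(name)s, since those would be escaped as well.

```python
def escape_percent(query: str) -> str:
    """Hypothetical helper: double every % so that %-formatting keeps it literal."""
    return query.replace("%", "%%")


raw = "WHEN elb_response_code LIKE '5%' THEN 1"
escaped = escape_percent(raw)

# Mimics the (operation % kwargs).strip() expression with an empty dict.
formatted = (escaped % {}).strip()

print(escaped)
print(formatted)
```

After formatting, the string passed on to the database is identical to the unescaped original, so the LIKE pattern itself is unchanged.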