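To set up monitoring, I'm trying to get the call duration for a specific file, aggregate that data, and feed it to Grafana, but my query fails for a strange reason even though I've double-checked everything.
In my Redshift database I have a table prodloadtimes whose resource column is a JSON array of all calls (the network log). I pick out the call I want from that array with this code: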
select created,
useragent,
position('"pathname":"/api/player/screenName' in resource) as pos1,
right( resource, len(resource) - pos1) as pathname_start_string,
position(',{' in pathname_start_string) as pos2,
left (pathname_start_string, pos2-1) as pathname_tail,
concat(left (resource, pos1 + pos2 - 1), ']') as string3,
json_array_length(string3) as arr_len,
json_extract_array_element_text ( string3 , arr_len-1) as commons_element,
is_valid_json(commons_element) as "test"
from prodloadtimes
where resource like '%"pathname":"/api/player/screenName"%'
and created between '2018-12-01' and '2018-12-18 12:00'
and pos2 > 0
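So basically, I find the element I need in the array and cut off all the elements that follow it, so that the element I need becomes the last one. I then extract it with json_extract_array_element_text (string3, arr_len-1).
Now I extract the data I need with this code: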
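To make the slicing concrete, here is a minimal, untested sketch that runs the same expressions against a single made-up value. The sample CTE and its three-element array are hypothetical and only stand in for a real resource value, which would be much longer.

```sql
-- Hypothetical three-element sample; not taken from prodloadtimes.
with sample as (
    select '[{"name":{"pathname":"/a"},"duration":5},'
        || '{"name":{"pathname":"/api/player/screenName"},"duration":12.5,"transferSize":300},'
        || '{"name":{"pathname":"/b"},"duration":7}]' as resource
)
select
    -- 1-based offset of the opening quote of the matching "pathname" key
    position('"pathname":"/api/player/screenName' in resource) as pos1,
    -- everything after that quote
    right(resource, len(resource) - pos1) as pathname_start_string,
    -- offset of the first ',{' after the match, i.e. the start of the next element
    position(',{' in pathname_start_string) as pos2,
    -- keep the array up to the closing '}' of the wanted element, then re-close it
    concat(left(resource, pos1 + pos2 - 1), ']') as string3,
    -- two elements remain in this sample, so arr_len is 2
    json_array_length(string3) as arr_len,
    -- the last remaining element is the screenName call
    json_extract_array_element_text(string3, arr_len - 1) as commons_element
from sample;
```

In this sample the first ',{' after the match really is the boundary to the next array element, so string3 stays a well-formed array. The approach relies on that being true for every row.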
select created,
useragent,
json_extract_path_text(commons_element,'duration') as duration,
json_extract_path_text(commons_element,'transferSize') as transfersize,
json_extract_path_text(commons_element,'name', 'pathname') as pathname,
json_extract_path_text(commons_element,'encodedBodySize') as size,
commons_element,
is_valid_json(commons_element) as "test"
from (
select created,
useragent,
position('"pathname":"/api/player/screenName' in resource) as pos1,
right( resource, len(resource) - pos1) as pathname_start_string,
position(',{' in pathname_start_string) as pos2,
left (pathname_start_string, pos2-1) as pathname_tail,
concat(left (resource, pos1 + pos2 - 1), ']') as string3,
json_array_length(string3) as arr_len,
json_extract_array_element_text ( string3 , arr_len-1) as commons_element,
is_valid_json(commons_element) as "test"
from prodloadtimes
where resource like '%"pathname":"/api/player/screenName"%'
and created between '2018-12-01' and '2018-12-18 12:00'
and pos2 > 0
)
where transfersize > 0
and duration > 0
This works fine, but when I try to compute percentiles over the data like this:
select floor(extract(epoch from created)/900)*900 AS "time",
percentile_cont(0.5) within group (order by cast(duration as float4) asc) AS "50th",
percentile_cont(0.75) within group (order by cast(duration as float4) asc) AS "75th",
percentile_cont(0.95) within group (order by cast(duration as float4) asc) AS "95th"
from (
select created,
useragent,
json_extract_path_text(commons_element,'duration') as duration,
json_extract_path_text(commons_element,'transferSize') as transfersize,
json_extract_path_text(commons_element,'name', 'pathname') as pathname,
json_extract_path_text(commons_element,'encodedBodySize') as size,
commons_element,
is_valid_json(commons_element) as "test"
from (
select created,
useragent,
position('"pathname":"/api/player/screenName' in resource) as pos1,
right( resource, len(resource) - pos1) as pathname_start_string,
position(',{' in pathname_start_string) as pos2,
left (pathname_start_string, pos2-1) as pathname_tail,
concat(left (resource, pos1 + pos2 - 1), ']') as string3,
json_array_length(string3) as arr_len,
json_extract_array_element_text ( string3 , arr_len-1) as commons_element,
is_valid_json(commons_element) as "test"
from prodloadtimes
where resource like '%"pathname":"/api/player/screenName"%'
and created between '2018-12-01' and '2018-12-18 12:00'
and pos2 > 0
)
where transfersize > 0
and duration > 0)
group by "time"
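Redshift responds with an error:
Amazon Invalid operation: JSON parsing error Details:
error: JSON parsing error code: 8001 context: invalid json array object
Given that commons_element is the output of the built-in function json_extract_array_element_text, I don't see how it can be an invalid JSON object, so any help would be appreciated.
P.S. I know that using all these nested inline selects is a somewhat clumsy approach, but this is more of a development version that I built up step by step, and I expected it to work.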
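One check that might narrow this down is sketched below. It rests on an unconfirmed assumption: the error text mentions an invalid json array, so the failing input may be string3 rather than commons_element. That could happen, for example, if the wanted element itself contains ',{' and the cut lands in the middle of it. The sliced alias and the limit are arbitrary choices, and the query is untested.

```sql
-- Diagnostic sketch: list rows whose truncated string3 is not a well-formed JSON array.
select created,
       string3
from (
    select created,
           position('"pathname":"/api/player/screenName' in resource) as pos1,
           right(resource, len(resource) - pos1) as pathname_start_string,
           position(',{' in pathname_start_string) as pos2,
           concat(left(resource, pos1 + pos2 - 1), ']') as string3
    from prodloadtimes
    where resource like '%"pathname":"/api/player/screenName"%'
      and created between '2018-12-01' and '2018-12-18 12:00'
) as sliced
where pos2 > 0
  -- is_valid_json_array returns false instead of raising a parse error
  and not is_valid_json_array(string3)
limit 20;
```

If this returns rows, the same is_valid_json_array(string3) test could be used as a filter before json_array_length is called. If it returns nothing, the assumption above is wrong and the cause lies elsewhere.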