Question

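To set up monitoring, I am trying to get the call duration for a specific file, aggregate that data, and feed it to Grafana, but my query fails for a strange reason even though I have double-checked everything.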

In my Redshift database I have a table called prodloadtimes, whose resource column holds a JSON array of all calls (a network log). I pick out the call I want with this code:

              select created,
                    useragent,
                    position('"pathname":"/api/player/screenName' in resource) as pos1,
                    right( resource, len(resource) - pos1) as pathname_start_string,
                    position(',{' in pathname_start_string) as pos2,
                    left (pathname_start_string, pos2-1) as pathname_tail,
                    concat(left (resource, pos1 + pos2 - 1), ']') as string3,
                    json_array_length(string3) as arr_len,
                    json_extract_array_element_text ( string3  , arr_len-1) as commons_element,
                    is_valid_json(commons_element) as "test"

                  from prodloadtimes
                  where resource like '%"pathname":"/api/player/screenName"%'
                    and created between '2018-12-01' and '2018-12-18 12:00'
                    and pos2 > 0

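So basically, I find the element I need in the array and cut off everything after it, so that the element I need becomes the last one. I then extract that element with json_extract_array_element_text (string3, arr_len-1).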
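To make the slicing concrete, here is a minimal sketch on a single made-up value. The sample CTE and its JSON content are hypothetical stand-ins for one row of prodloadtimes.resource; the functions are the same ones used in the query above, split into steps so that each intermediate value is visible.

    -- Hypothetical one-row sample; the real data lives in prodloadtimes.resource.
    with sample as (
        select '[{"name":{"pathname":"/a"},"duration":1},'
            || '{"name":{"pathname":"/api/player/screenName"},"duration":42},'
            || '{"name":{"pathname":"/b"},"duration":3}]' as resource
    ),
    found as (
        -- 1-based position of the wanted pathname inside the array text
        select resource,
               position('"pathname":"/api/player/screenName' in resource) as pos1
        from sample
    ),
    cut as (
        -- offset of the next ',{' (start of the following element) after pos1
        select resource, pos1,
               position(',{' in right(resource, len(resource) - pos1)) as pos2
        from found
    ),
    sliced as (
        -- keep everything before that ',{' and close the array again
        select left(resource, pos1 + pos2 - 1) || ']' as string3
        from cut
        where pos2 > 0
    )
    select string3,
           json_extract_array_element_text(string3, json_array_length(string3) - 1) as commons_element
    from sliced;

Note that position returns 0 when ',{' is not found, which happens when the wanted call is the last element of the array; that is the case the pos2 > 0 filter excludes.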

Now I extract the data I need with the following code:

  select created,
         useragent,
         json_extract_path_text(commons_element,'duration') as duration,
         json_extract_path_text(commons_element,'transferSize') as transfersize,
         json_extract_path_text(commons_element,'name', 'pathname') as pathname,
         json_extract_path_text(commons_element,'encodedBodySize') as size,
         commons_element,
         is_valid_json(commons_element) as "test"
      from (
              select created,
                    useragent,
                    position('"pathname":"/api/player/screenName' in resource) as pos1,
                    right( resource, len(resource) - pos1) as pathname_start_string,
                    position(',{' in pathname_start_string) as pos2,
                    left (pathname_start_string, pos2-1) as pathname_tail,
                    concat(left (resource, pos1 + pos2 - 1), ']') as string3,
                    json_array_length(string3) as arr_len,
                    json_extract_array_element_text ( string3  , arr_len-1) as commons_element,
                    is_valid_json(commons_element) as "test"

                  from prodloadtimes
                  where resource like '%"pathname":"/api/player/screenName"%'
                    and created between '2018-12-01' and '2018-12-18 12:00'
                    and pos2 > 0
                    )   
        where transfersize > 0 
          and duration > 0

This works fine, but when I try to compute percentiles over the data with this:

select floor(extract(epoch from created)/900)*900 AS "time",
  percentile_cont(0.5) within group (order by cast(duration as float4) asc) AS "50th",
  percentile_cont(0.75) within group (order by cast(duration as float4) asc) AS "75th",
  percentile_cont(0.95) within group (order by cast(duration as float4) asc) AS "95th"
  from (
  select created,
         useragent,
         json_extract_path_text(commons_element,'duration') as duration,
         json_extract_path_text(commons_element,'transferSize') as transfersize,
         json_extract_path_text(commons_element,'name', 'pathname') as pathname,
         json_extract_path_text(commons_element,'encodedBodySize') as size,
         commons_element,
         is_valid_json(commons_element) as "test"
      from (
              select created,
                    useragent,
                    position('"pathname":"/api/player/screenName' in resource) as pos1,
                    right( resource, len(resource) - pos1) as pathname_start_string,
                    position(',{' in pathname_start_string) as pos2,
                    left (pathname_start_string, pos2-1) as pathname_tail,
                    concat(left (resource, pos1 + pos2 - 1), ']') as string3,
                    json_array_length(string3) as arr_len,
                    json_extract_array_element_text ( string3  , arr_len-1) as commons_element,
                    is_valid_json(commons_element) as "test"

                  from prodloadtimes
                  where resource like '%"pathname":"/api/player/screenName"%'
                    and created between '2018-12-01' and '2018-12-18 12:00'
                    and pos2 > 0
                    )   
        where transfersize > 0 
          and duration > 0)
  group by "time"

Redshift responds with the error

Amazon Invalid operation: JSON parsing error Details:

error: JSON parsing error code: 8001 context: invalid json array object

Given that commons_element is the output of the built-in function json_extract_array_element_text, I don't see how it could be an invalid JSON object, so any help would be appreciated.
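One way to narrow this down is to list the rows whose truncated string is not a valid JSON array before any extraction function touches it. This is a diagnostic sketch only, not a confirmed explanation of the error: it reuses the table, columns, and slicing from the query above, the alias sliced is introduced here for illustration, and it deliberately leaves out the pos2 > 0 filter so that those rows show up too.

    -- Diagnostic sketch: show rows where the sliced text is not a valid JSON array.
    select created,
           pos1,
           pos2,
           string3
    from (
        select created,
               position('"pathname":"/api/player/screenName' in resource) as pos1,
               position(',{' in right(resource, len(resource) - pos1)) as pos2,
               concat(left(resource, pos1 + pos2 - 1), ']') as string3
        from prodloadtimes
        where resource like '%"pathname":"/api/player/screenName"%'
          and created between '2018-12-01' and '2018-12-18 12:00'
    ) as sliced
    where not is_valid_json_array(string3);

If this returns rows, the error message refers to string3 (the array passed to json_array_length and json_extract_array_element_text) rather than to commons_element. Redshift's JSON functions also accept an optional trailing null_if_invalid argument, for example json_array_length(string3, true), which returns NULL instead of raising an error on invalid input.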

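P.S. I know that using all these inline selects is a somewhat clumsy approach, but this is more of a development version that I built up step by step, and I expected it to work.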

"Invalid json array object" in Redshift, even though it is OK
