Find rows before and after a matching row in BigQuery?

Asked: 2016-07-09 17:14:32

Tags: google-bigquery

Is it possible to find the rows before and after a matching row in a BigQuery query? For example, if I run:

select textPayload from logs.logs_20160709 where textPayload like "%something%"

and, say, I get these results:

something A
something B

How can I show the 3 rows before and the 3 rows after each matching row? Like this:

some text 1
some text 2
some text 3
something A
some text 4
some text 5
some text 6
some text 90
some text 91
some text 92
something B
some text 93
some text 94
some text 95

Is this possible, and if so, how?

2 answers:

Answer 0: (score: 2)

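While at Zuma Beach, I was thinking about how to avoid the CROSS JOIN in my original answer. Check the query below. It should be much cheaper, especially on big tables, because it scans the table once and needs no join.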

SELECT textPayload
FROM (
  SELECT textPayload, 
    SUM(match) OVER(ORDER BY ts ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING) AS flag
  FROM (
    SELECT textPayload, ts,  IF(textPayload CONTAINS 'something', 1, 0) AS match 
    FROM YourTable
  )
)
WHERE flag > 0

Of course, another way to avoid the cross join is to use BigQuery Standard SQL. Still, the join-free solution above is better than my original answer.
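
For anyone who wants to check the windowed-flag logic without running a query, here is a minimal Python sketch of the same three steps (mark matches, sum the marks over a window of 3 rows on each side, keep rows whose sum is positive). The function name `rows_with_context` and its parameters are made up for illustration, and it assumes the rows are already sorted by ts.

```python
def rows_with_context(rows, needle, context=3):
    """Return every row within `context` positions of a row containing `needle`.

    `rows` is a list of strings already sorted by the ordering column
    (ts in the query). This only mirrors the query's logic; it is not BigQuery.
    """
    # Inner SELECT: 1 for a matching row, 0 otherwise.
    match = [1 if needle in text else 0 for text in rows]

    kept = []
    for i, text in enumerate(rows):
        # SUM(match) OVER (ORDER BY ts ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING):
        # the window is simply cut short at either end of the table.
        lo = max(0, i - context)
        hi = min(len(rows), i + context + 1)
        flag = sum(match[lo:hi])

        # Outer WHERE flag > 0.
        if flag > 0:
            kept.append(text)
    return kept


rows = ["some text %d" % i for i in range(10)]
rows[5] = "something A"
for line in rows_with_context(rows, "something"):
    print(line)
```

Because every row is tested once against its own window, a row that sits near two matches is still returned only once.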

Answer 1: (score: 0)

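I think one piece is missing from your example: an extra field that defines the row order. So I added a ts field in my answer, which means I am assuming your table has two relevant fields: textPayload and ts.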

Try the query below. It should give you exactly what you need.

SELECT 
  all.textPayload
FROM (
  SELECT start, finish
  FROM (
    SELECT textPayload,
      LAG(ts, 3) OVER(ORDER BY ts ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS start, 
      LEAD(ts, 3) OVER(ORDER BY ts ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING) AS finish
    FROM YourTable
  )
  WHERE textPayload CONTAINS 'something'
) AS matches
CROSS JOIN YourTable AS all
WHERE all.ts BETWEEN matches.start AND matches.finish

Please note: depending on the type of your ts field, you may need to do some data casting on that field in the query. Hopefully not.
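
The range-join logic of this query can also be traced in plain Python. The sketch below is only a model of it, with a made-up function name `context_by_range_join`. It assumes ts values are unique and sorted, and that LAG/LEAD return NULL (modelled as `None`) when the offset runs past either end of the table.

```python
def context_by_range_join(rows, needle, context=3):
    """Model the LAG/LEAD + CROSS JOIN query on a list of (ts, text) tuples.

    `rows` must be sorted by ts with unique ts values.
    """
    ts_values = [ts for ts, _ in rows]

    # Subquery `matches`: one (start, finish) pair per matching row.
    # None stands in for the NULL that LAG/LEAD produce near the table edges.
    matches = []
    for i, (ts, text) in enumerate(rows):
        if needle in text:
            start = ts_values[i - context] if i - context >= 0 else None
            finish = ts_values[i + context] if i + context < len(rows) else None
            matches.append((start, finish))

    # CROSS JOIN ... WHERE all.ts BETWEEN matches.start AND matches.finish:
    # every match is compared against every row of the table.
    result = []
    for start, finish in matches:
        if start is None or finish is None:
            continue  # BETWEEN with a NULL bound is never true
        for ts, text in rows:
            if start <= ts <= finish:
                result.append(text)
    return result


rows = [(i, "some text %d" % i) for i in range(10)]
rows[5] = (5, "something A")
for line in context_by_range_join(rows, "something"):
    print(line)
```

Under these assumptions the model suggests two things to watch for: a match within 3 rows of the start or end of the table contributes nothing, and a row that falls inside the ranges of two nearby matches is returned once per match.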