How do I get the first not-null value from a column of values in BigQuery?

Time: 2015-09-25 19:01:12

Tags: sql bigdata google-bigquery

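I am trying to extract the first non-null value from a column of values, ordered by timestamp. Could someone share their thoughts on this? Thanks.

What have I tried so far?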

FIRST_VALUE( column ) OVER ( PARTITION BY id ORDER BY timestamp)

Input :-

4 answers:

Answer 0: (score: 4)

Try this old string-manipulation trick:

SELECT
  ID,
  Column,
  ttimestamp,
  LTRIM(Right(CColumn,20)) as CColumn
FROM
(SELECT
  ID,
  Column,
  ttimestamp,
  MIN(Concat(RPAD(IF(Column is null, '9999999999999999',STRING(ttimestamp)),20,'0'),LPAD(Column,20,' '))) OVER (Partition by ID) CColumn
FROM (

  SELECT
    *
  FROM (Select 1 as ID, STRING(NULL) as Column, 0.4375 as ttimestamp),
        (Select 1 as ID, STRING(NULL) as Column, 0.438194444444444 as ttimestamp),
        (Select 1 as ID, 'xyz' as Column, 0.438888888888889 as ttimestamp),
        (Select 1 as ID, 'def' as Column, 0.439583333333333 as ttimestamp),
        (Select 2 as ID, STRING(NULL) as Column, 0.479166666666667 as ttimestamp),
        (Select 2 as ID, 'abc' as Column, 0.479861111111111 as ttimestamp)
))
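
To make the padding trick easier to follow, here is a minimal sketch that emulates it in plain Python rather than in BigQuery. The function names and the SENTINEL constant are made up for illustration; the rows are the ones from the query above. It assumes values are at most 20 characters long and that all timestamps have the same number of digits before the decimal point, because the keys are compared as strings.

```python
# Emulates the padded-string MIN trick in plain Python (not BigQuery itself).
SENTINEL = "9" * 16  # sorts after every timestamp string used in this sample

ROWS = [  # (id, value, timestamp as a day fraction), copied from the query above
    (1, None, 0.4375),
    (1, None, 0.438194444444444),
    (1, "xyz", 0.438888888888889),
    (1, "def", 0.439583333333333),
    (2, None, 0.479166666666667),
    (2, "abc", 0.479861111111111),
]


def sort_key(value, ts):
    """Build a 40-character key: 20 chars of timestamp, then 20 chars of value."""
    # Rows with a null value get the sentinel so they sort last.
    ts_part = (SENTINEL if value is None else str(ts)).ljust(20, "0")
    # Assumption of this sketch: values are at most 20 characters long.
    value_part = ("" if value is None else value).rjust(20, " ")
    return ts_part + value_part


def first_non_null(rows):
    """Map each id to the value carried by its smallest key."""
    best = {}
    for row_id, value, ts in rows:
        key = sort_key(value, ts)
        if row_id not in best or key < best[row_id]:
            best[row_id] = key
    # The last 20 characters of the winning key hold the left-padded value.
    return {row_id: key[-20:].lstrip() for row_id, key in best.items()}


print(first_non_null(ROWS))  # {1: 'xyz', 2: 'abc'}
```

The smallest key per ID belongs to the earliest row that has a value, so cutting off the last 20 characters and trimming the padding recovers that value.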

Answer 1: (score: 3)

You can modify your SQL like this to get the data you want.

FIRST_VALUE( column )
  OVER ( 
    PARTITION BY id
    ORDER BY
      CASE WHEN column IS NULL then 0 ELSE 1 END DESC,
      timestamp
  )
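
As a quick way to check the ordering logic, the sketch below runs the same window expression from Python against an in-memory SQLite database. SQLite only stands in for BigQuery here, and it needs version 3.25 or later for window functions. The table name t, the column names val and ts, and the function name are made up for illustration; the sample rows are borrowed from Answer 0.

```python
import sqlite3

ROWS = [  # (id, value, timestamp as a day fraction)
    (1, None, 0.4375),
    (1, None, 0.438194444444444),
    (1, "xyz", 0.438888888888889),
    (1, "def", 0.439583333333333),
    (2, None, 0.479166666666667),
    (2, "abc", 0.479861111111111),
]

QUERY = """
    SELECT id, ts,
           FIRST_VALUE(val) OVER (
               PARTITION BY id
               ORDER BY CASE WHEN val IS NULL THEN 0 ELSE 1 END DESC, ts
           ) AS first_val
    FROM t
    ORDER BY id, ts
"""


def first_value_per_row(rows):
    """Return (id, ts, first non-null value for that id) for every input row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, val TEXT, ts REAL)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in first_value_per_row(ROWS):
    print(row)
```

The CASE expression puts the non-null rows at the front of each partition, and the timestamp then orders them, so the first row of the window is the earliest row that has a value.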

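Answer 2: (score: 2)

As far as I know, BigQuery has no options like 'IGNORE NULLS' or 'NULLS LAST'. Given that, this is the simplest solution I could come up with. I would like to see a simpler one. Assuming the input data is in the table "original_data",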

select w2.id, w1.column, w2.timestamp
from
(select id,column,timestamp
   from
     (select id,column,timestamp, row_number() 
                   over (partition BY id ORDER BY timestamp) position
       FROM original_data
       where column is not null
    )
   where position=1 
) w1
right outer join
 original_data as w2
on w1.id = w2.id 
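
For anyone who wants to try the idea without a BigQuery project, here is a rough sketch that runs an equivalent query from Python against an in-memory SQLite database (3.25 or later for ROW_NUMBER). It is an adaptation, not the query above verbatim: the columns are renamed to val and ts, and the right outer join is written as a left join starting from original_data, since older SQLite versions have no RIGHT JOIN. The row with id 3 is a made-up addition to show what happens when an id has no non-null value at all.

```python
import sqlite3

ROWS = [  # (id, value, timestamp as a day fraction)
    (1, None, 0.4375),
    (1, None, 0.438194444444444),
    (1, "xyz", 0.438888888888889),
    (1, "def", 0.439583333333333),
    (2, None, 0.479166666666667),
    (2, "abc", 0.479861111111111),
    (3, None, 0.5),  # hypothetical id whose values are all null
]

QUERY = """
    SELECT w2.id, w1.val, w2.ts
    FROM original_data AS w2
    LEFT JOIN (
        SELECT id, val
        FROM (
            SELECT id, val,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY ts) AS position
            FROM original_data
            WHERE val IS NOT NULL
        )
        WHERE position = 1
    ) AS w1 ON w1.id = w2.id
    ORDER BY w2.id, w2.ts
"""


def join_first_non_null(rows):
    """Attach each id's earliest non-null value to every row of that id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE original_data (id INTEGER, val TEXT, ts REAL)")
    conn.executemany("INSERT INTO original_data VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in join_first_non_null(ROWS):
    print(row)
```

The outer join is what keeps the rows for id 3 in the result, with a null in the value column; an inner join would drop them.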

Answer 3: (score: 0)

SELECT id,
    (SELECT top(1) column FROM test1 where id = 1 and column is auto order by autoID desc) as name,
    timestamp
from yourtable

Output :-

1, 'xyz', 10:30 am
1, 'xyz', 10:31 am
1, 'xyz', 10:32 am
1, 'xyz', 10:33 am
2, 'abc', 11:30 am
2, 'abc', 11:31 am