Left join to populate data from 2 tables in Google BigQuery

Time: 2016-05-17 19:13:46

Tags: google-bigquery

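Below are 2 tables, RawDebug and CarrierDetails. In RawDebug, if DebugData is VER%, then ActualDebugData is Verizon. If DebugData is a number, we first have to replace the other characters, such as ? and ", with ' ', and then look up the CarrierDetails table to pick its Network, where Mcc = substr("310410",0,3) and Mnc = substr("310410",4,2). That Network is then populated into ActualDebugData.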

Table RawDebug:

 HardwareId DebugData ActualDebugData
 123        VER%      Verizon
 456        310410?   Bell

Table CarrierDetails:

 Mcc Mnc Network
 310 410 Bell

I tried:

 SELECT 
 HardwareId, DebugReason, DebugData, 
 CASE
  WHEN lower(DebugData) LIKE 'ver%' THEN 'Verizon'
  WHEN REGEXP_MATCH(DebugData,'\\d+') THEN c.Network
  ELSE REGEXP_REPLACE(DebugData,'\\?',' ')
 END
 AS ActualDebugData 
 FROM (
   SELECT 
   HardwareId, DebugReason, DebugData, 
   INTEGER(SUBSTR(DebugData,0,3)) AS d1,    INTEGER(SUBSTR(REGEXP_REPLACE(DebugData,'^[a-zA-Z0-9]',' '),4,LENGTH(DebugData)-1)) as d2 
   FROM TABLE_DATE_RANGE([bigdata:RawDebug.T],TIMESTAMP('2016-05-15'),TIMESTAMP('2016-05-15'))
   WHERE DebugReason = 50013
   ) AS d
   LEFT JOIN (
   SELECT 
   Network, Mcc, Mnc
   FROM [bigdata:RawDebug.CarrierDetails] 
   ) AS c
   ON c.Mcc = d.d1 and c.Mnc = d.d2 
   LIMIT 400

1 answer:

Answer 0 (score: 2)

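Keep in mind: an answer is usually only as good as the question!
Hope this helps, but judging by the history of your questions, this might not be the end :o)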

SELECT 
  HardwareId, DebugReason, DebugData,
  CASE
    WHEN LOWER(DebugData) LIKE 'ver%' THEN 'Verizon'
    WHEN REGEXP_MATCH(DebugData,'\\d+') THEN c.Network
    ELSE REGEXP_REPLACE(DebugData,'\\?',' ')
  END AS ActualDebugData 
FROM (
  SELECT 
    HardwareId, DebugReason, DebugData, 
    INTEGER(SUBSTR (DebugData, 1, 3)) AS d1,    
    INTEGER(SUBSTR (DebugData, 4, 3)) AS d2 
  FROM //TABLE_DATE_RANGE([bigdata:RawDebug.T],TIMESTAMP('2016-05-15'),TIMESTAMP('2016-05-15'))
      (SELECT 123 AS HardwareId, 'VER%' AS DebugData, 'Verizon' AS ActualDebugData, 50013 AS DebugReason), // sample data
      (SELECT 456 AS HardwareId, '310410?' AS DebugData, 'Bell' AS ActualDebugData, 50013 AS DebugReason)  // sample data
  WHERE DebugReason = 50013
) AS d
LEFT JOIN (
  SELECT 
    Network, Mcc, Mnc
  FROM //[bigdata:RawDebug.CarrierDetails] 
    (SELECT 310 AS Mcc, 410 AS Mnc, 'Bell' AS Network) // sample data
) AS c
ON c.Mcc = d.d1 AND c.Mnc = d.d2 
LIMIT 400

Output:

HardwareId  DebugReason DebugData   ActualDebugData  
123         50013       VER%        Verizon  
456         50013       310410?     Bell
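
To check the join keys in isolation, the two SUBSTR expressions from the inner query can be run on their own. The following is a minimal sketch in BigQuery legacy SQL against a single inline sample row, not against the real tables. It relies on legacy SQL SUBSTR counting positions from 1, which is why the answer uses start positions 1 and 4 with a length of 3. For the sample value '310410?' it is expected to give d1 = 310 and d2 = 410, consistent with row 456 matching Bell in the output above.

```sql
-- Sketch only: inline sample row standing in for the RawDebug table
SELECT
  DebugData,
  INTEGER(SUBSTR(DebugData, 1, 3)) AS d1,  -- characters 1-3, compared with Mcc
  INTEGER(SUBSTR(DebugData, 4, 3)) AS d2   -- characters 4-6, compared with Mnc
FROM
  (SELECT '310410?' AS DebugData)
```

If the two columns do not line up with Mcc and Mnc in CarrierDetails for real data, the LEFT JOIN still returns the row, but Network comes back as NULL.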