Retrieving data from fields with multiple values

Date: 2018-09-19 17:31:51

Tags: google-bigquery

We have a table named person. Some of its fields hold multiple values, for example:

person

ID name  tripNumber startPlace   endPlace
1  xxx    20         Portland    Atlanta
          25         California  Atlanta
          40         America     Africa
2  EKVV   40         America     Africa
          37         Argentina   Carolina

We need to retrieve the entire row of data that matches a particular condition, such as tripNumber=40 and endPlace="Africa".

We need the result to look like this:

ID name  tripNumber startPlace   endPlace
1  xxx    40         America     Africa
2  EKVV   40         America     Africa

1 answer:

Answer 0 (score: 2)

The following is for BigQuery Standard SQL:

#standardSQL
WITH `project.dataset.person` AS (
  SELECT 1 id, 'xxx' name, [20, 25, 40] tripNumber, ['Portland', 'California', 'America'] startPlace, ['Atlanta', 'Atlanta', 'Africa'] endPlace UNION ALL
  SELECT 2,  'EKVV', [40, 37], ['America', 'Argentina'], ['Africa', 'Carolina']
)
SELECT id, name, tripNumber, startPlace[SAFE_OFFSET(off)] startPlace, endPlace[SAFE_OFFSET(off)] endPlace 
FROM `project.dataset.person`,
UNNEST(tripNumber) tripNumber WITH OFFSET off
WHERE tripNumber = 40
  AND endPlace[SAFE_OFFSET(off)] = 'Africa'

with the result

Row id  name    tripNumber  startPlace  endPlace     
1   1   xxx     40          America     Africa   
2   2   EKVV    40          America     Africa     

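The solution above assumes that you have independent repeated fields and that their values should be matched by position within the respective arrays.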
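
Positional matching only lines up when the arrays have the same length. The sketch below is a minimal illustration of what SAFE_OFFSET does when they do not; the `sample` CTE and its column names `tripNumbers` and `startPlaces` are hypothetical and are not part of the person table.

```sql
#standardSQL
-- Hypothetical one-row input: startPlaces is one element shorter than tripNumbers.
WITH sample AS (
  SELECT [20, 25, 40] AS tripNumbers, ['Portland', 'California'] AS startPlaces
)
SELECT
  tripNumber,
  off,
  -- SAFE_OFFSET returns NULL when off points past the end of startPlaces
  startPlaces[SAFE_OFFSET(off)] AS startPlace
FROM sample,
UNNEST(tripNumbers) AS tripNumber WITH OFFSET AS off
ORDER BY off
```

The row for tripNumber 40 (offset 2) comes back with a NULL startPlace. With plain OFFSET instead of SAFE_OFFSET, the same lookup would raise an out-of-bounds error.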

Below is a more common pattern, based on a repeated record.

Suppose the person table looks like this:

Row id  name    trips.tripNumber    trips.startPlace    trips.endPlace   
1   1   xxx     20                  Portland            Atlanta  
                25                  California          Atlanta  
                40                  America             Africa   
2   2   EKVV    40                  America             Africa   
                37                  Argentina           Carolina        

In that case, the solution would be

#standardSQL
WITH `project.dataset.person` AS (
  SELECT 1 id, 'xxx' name, [STRUCT<tripNumber INT64, startPlace STRING, endPlace STRING>(20, 'Portland', 'Atlanta'),(25, 'California', 'Atlanta'),(40, 'America', 'Africa')] trips UNION ALL
  SELECT 2, 'EKVV', [STRUCT(40, 'America', 'Africa'),(37, 'Argentina', 'Carolina')]
)
SELECT id, name, tripNumber, startPlace, endPlace 
FROM `project.dataset.person`,
UNNEST(trips) trip
WHERE tripNumber = 40
  AND endPlace = 'Africa'

with the result

Row id  name    tripNumber  startPlace  endPlace     
1   1   xxx     40          America     Africa   
2   2   EKVV    40          America     Africa
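
In the repeated-record query, the bare names tripNumber, startPlace and endPlace resolve to fields of the unnested trip struct. As a sketch of an equivalent form, the same query can be written with each field qualified by the `trip` alias and with both of the question's conditions spelled out. It reuses the sample data from the answer, and the explicit qualification is a readability choice rather than a requirement.

```sql
#standardSQL
WITH `project.dataset.person` AS (
  SELECT 1 id, 'xxx' name, [STRUCT<tripNumber INT64, startPlace STRING, endPlace STRING>(20, 'Portland', 'Atlanta'),(25, 'California', 'Atlanta'),(40, 'America', 'Africa')] trips UNION ALL
  SELECT 2, 'EKVV', [STRUCT(40, 'America', 'Africa'),(37, 'Argentina', 'Carolina')]
)
SELECT
  id,
  name,
  trip.tripNumber,   -- fields are read from the unnested struct
  trip.startPlace,
  trip.endPlace
FROM `project.dataset.person`,
UNNEST(trips) AS trip            -- one output row per element of trips
WHERE trip.tripNumber = 40
  AND trip.endPlace = 'Africa'
```

For the sample data this returns one row for each of id 1 and id 2, each with tripNumber 40, startPlace America and endPlace Africa.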