BigQuery: joining a Firestore collection with its subcollection

Time: 2019-06-18 19:55:54

Tags: sql join google-cloud-firestore google-bigquery

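I imported data from Firestore and have a collection users with a subcollection profiles. The user's key (e.g. "USER_KEY") can be found in matchingUsers.__key__.name, and the __key__.path property of the profiles subcollection has the form '"users", "USER_KEY", "profiles", "PROFILE_KEY"'.

I'm trying to get the profiles of all users, so I'm joining the two tables. In the example below I have replaced matchingUsers.__key__.name with userId and profiles.__key__.path with path.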

WITH users AS (
  SELECT "micheleId" AS userId, "Michele" as name UNION ALL
  SELECT "matteoId", "Matteo"
),
profiles AS (
  SELECT "x" AS profileId, '"users", "micheleId", "profiles", "x"' AS path, 'player' AS type UNION ALL
  SELECT "y", '"users", "micheleId", "profiles", "y"', 'coach' UNION ALL
  SELECT "z", '"users", "matteoId", "profiles", "z"', 'team'
)
SELECT userId, profileId, type 
FROM users JOIN profiles ON users.userId IN UNNEST(SPLIT(profiles.path ));

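I use SPLIT to turn the path into an array, and then IN UNNEST so that rows are joined only when the user key appears in the path.

The result I get is empty, whereas I fully expected: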

+-----------+-----------+--------+
| userId    | profileId | type   |
+-----------+-----------+--------+
| micheleId | x         | player |
| micheleId | y         | coach  |
| matteoId  | z         | team   |
+-----------+-----------+--------+

1 answer:

Answer 0: (score: 1)

Below is one way to "fix" your query (the only change is in the last line)

#standardSQL
WITH users AS (
  SELECT "micheleId" AS userId, "Michele" AS name UNION ALL
  SELECT "matteoId", "Matteo"
),
profiles AS (
  SELECT "x" AS profileId, 'users, micheleId, profiles, x' AS path, 'player' AS type UNION ALL
  SELECT "y", 'users, micheleId, profiles, y', 'coach' UNION ALL
  SELECT "z", 'users, matteoId, profiles, z', 'team'
)
SELECT userId, profileId, type 
FROM users 
JOIN profiles 
ON users.userId IN UNNEST(SPLIT(REPLACE(profiles.path, ' ', '')))

Depending on your actual use case, the above can have variations like the ones below

ON users.userId IN UNNEST(SPLIT(profiles.path, ', '))   

OR

ON users.userId IN UNNEST(SPLIT(REGEXP_REPLACE(profiles.path, r'\s', '')))   

... and so on

In all of the above cases, the result is

Row userId      profileId   type     
1   micheleId   x           player   
2   micheleId   y           coach    
3   matteoId    z           team     
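
To see why the original ON clause matches nothing, it can help to inspect what SPLIT actually returns. The query below is an illustrative sketch, not part of the original answer; the column names part and bracketed are made up for the example. With no delimiter argument, SPLIT breaks a STRING on commas only, so every element after the first keeps its leading space.

```sql
#standardSQL
-- Wrap each element in brackets so leading whitespace becomes visible.
SELECT
  part,
  CONCAT('[', part, ']') AS bracketed
FROM UNNEST(SPLIT('users, micheleId, profiles, x')) AS part
```

The second element comes back as [ micheleId] rather than [micheleId], so it never equals users.userId until the spaces are stripped or the delimiter is changed to ', '.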
  

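Oops, I wrote the path strings incorrectly. The correct format is '"users", "micheleId", "profiles", "x"', as now updated in the question.

Below is a "fix" for that format as well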

#standardSQL
WITH users AS (
  SELECT "micheleId" AS userId, "Michele" AS name UNION ALL
  SELECT "matteoId", "Matteo"
),
profiles AS (
  SELECT "x" AS profileId, '"users", "micheleId", "profiles", "x"' AS path, 'player' AS type UNION ALL
  SELECT "y", '"users", "micheleId", "profiles", "y"', 'coach' UNION ALL
  SELECT "z", '"users", "matteoId", "profiles", "z"', 'team'
)
SELECT userId, profileId, type 
FROM users JOIN profiles 
ON users.userId IN UNNEST(SPLIT(REGEXP_REPLACE(profiles.path, r'[" ]', '' )))

with, obviously, the same result

Row userId      profileId   type     
1   micheleId   x           player   
2   micheleId   y           coach    
3   matteoId    z           team    

As you can see, this fixes the problem.
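
One caveat with IN UNNEST is that it compares the user key against every element of the path, so a profile key (or the collection name itself) that happens to equal a userId would also produce a match. A possible alternative, sketched here under the assumption that every path starts with "users" followed by the quoted user key as described in the question, is to extract that single element and join on equality. The alias pathUserId is hypothetical and not part of the original data.

```sql
#standardSQL
WITH users AS (
  SELECT "micheleId" AS userId, "Michele" AS name UNION ALL
  SELECT "matteoId", "Matteo"
),
profiles AS (
  SELECT "x" AS profileId, '"users", "micheleId", "profiles", "x"' AS path, 'player' AS type UNION ALL
  SELECT "y", '"users", "micheleId", "profiles", "y"', 'coach' UNION ALL
  SELECT "z", '"users", "matteoId", "profiles", "z"', 'team'
)
SELECT userId, profileId, type
FROM users
JOIN (
  -- Assumption: the user key is always the second quoted element of the path.
  SELECT *, REGEXP_EXTRACT(path, r'^"users", "([^"]+)"') AS pathUserId
  FROM profiles
) AS p
ON users.userId = p.pathUserId
```

On the sample data this should return the same three rows as the queries above, while ignoring the profile key and the collection names when matching.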