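I have imported Firebase data into BigQuery.
What I want to do is find the specific devices that always perform a certain event (Firebase interaction records). That is, every time such a device is recorded in Firebase, event_dim.name contains at least one entry of that event type.
For example, consider the following query, which uses sample data from (Link):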
#standardSQL
SELECT
user_dim.app_info.app_instance_id,
event_dim
FROM `firebase-analytics-sample-data.ios_dataset.app_events_20160607`
Suppose it returns data such as:
+-----------------+--------------------+
| app_instance_id | event_dim.name     |
+-----------------+--------------------+
| 1234            | os_update          |
|                 | initialized_rh_api |
+-----------------+--------------------+
| 1234            | os_update          |
+-----------------+--------------------+
| 5678            | os_update          |
|                 | initialized_rh_api |
+-----------------+--------------------+
| 5678            | other_action       |
+-----------------+--------------------+
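I want a query that returns the list of individual app_instance_id values for which event_dim.name contains 'os_update' in every record. By this criterion, for the items above, 1234 matches but 5678 does not.
Thanks. It may be simple, but I can't find a way. I can find every record that contains the entry, but I can't eliminate the devices that also have records without it.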
Answer 0 (score: 1)
I would use aggregation:
SELECT user_dim.app_info.app_instance_id FROM `firebase-analytics-sample-data.ios_dataset.app_events_20160607`
GROUP BY user_dim.app_info.app_instance_id
HAVING SUM(CASE WHEN event_dim.name NOT LIKE '%os_update%' THEN 1 ELSE 0 END) = 0;
The HAVING clause counts the number of events that do not match. The = 0 means there are none.
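Since event_dim holds several events per record (the question's sample shows multiple names in one row), the same aggregation idea may need to unnest the array first. A minimal sketch, assuming inline sample data that mirrors the question's table: the CTE names sample_events and flagged and the flattened app_instance_id column are hypothetical, and against the real export you would select user_dim.app_info.app_instance_id instead.

```sql
#standardSQL
-- Hypothetical inline data mirroring the question's sample table.
WITH sample_events AS (
  SELECT '1234' AS app_instance_id,
         [STRUCT('os_update' AS name), STRUCT('initialized_rh_api' AS name)] AS event_dim UNION ALL
  SELECT '1234', [STRUCT('os_update' AS name)] UNION ALL
  SELECT '5678', [STRUCT('os_update' AS name), STRUCT('initialized_rh_api' AS name)] UNION ALL
  SELECT '5678', [STRUCT('other_action' AS name)]
),
-- One boolean per record: does this record contain an os_update event?
flagged AS (
  SELECT
    app_instance_id,
    EXISTS (SELECT 1 FROM UNNEST(event_dim) AS e WHERE e.name = 'os_update') AS has_os_update
  FROM sample_events
)
-- Keep only devices that have no record lacking os_update.
SELECT app_instance_id
FROM flagged
GROUP BY app_instance_id
HAVING COUNTIF(NOT has_os_update) = 0
```

On this sample data the query should return only 1234, because 5678 has one record without os_update. The sketch uses an exact match on the event name rather than LIKE, which avoids matching other names that merely contain the substring.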
Answer 1 (score: 0)
I did this in an Oracle database using a regular expression and a self-join. Please see the example below.
CREATE TABLE EVENTS (app_instance_id NUMBER, event_dim_name VARCHAR2(100));

--- Sample record
INSERT INTO EVENTS VALUES(1234,'os_update initialized_rh_api');
INSERT INTO EVENTS VALUES(1234,'os_update');
INSERT INTO EVENTS VALUES(5678,'os_update initialized_rh_api');
INSERT INTO EVENTS VALUES(5678,'other_action');
INSERT INTO EVENTS VALUES(7895,'os_update initialized_rh_api');
INSERT INTO EVENTS VALUES(7895,'os_update');
INSERT INTO EVENTS VALUES(4567,'os_update initialized_rh_api');
INSERT INTO EVENTS VALUES(4567,'other_action');

-- Sample Query
SELECT EV.APP_INSTANCE_ID, EV.EVENT_DIM_NAME
FROM (SELECT DISTINCT app_instance_id,
             regexp_substr(event_dim_name, '^[os_update]+', 1, level) AS "event_dim_name"
      FROM EVENTS
      CONNECT BY regexp_substr(event_dim_name, '^[os_update]+', 1, level) IS NOT NULL
     ) TEMP,
     EVENTS EV
WHERE EV.APP_INSTANCE_ID = TEMP.app_instance_id
  AND EV.EVENT_DIM_NAME = TEMP."event_dim_name";
Answer 2 (score: 0)
Below is for BigQuery Standard SQL. For each app_instance_id, it returns every name that appears in all of that instance's records:
#standardSQL
SELECT app_instance_id, name
FROM (
SELECT app_instance_id, COUNT(1) cnt,
ARRAY_CONCAT_AGG(names) names
FROM (
SELECT user_dim.app_info.app_instance_id,
ARRAY(SELECT DISTINCT name FROM UNNEST(event_dim) dim) names
FROM `project.dataset.your_table`
)
GROUP BY app_instance_id
), UNNEST(names) name
GROUP BY app_instance_id, name
HAVING COUNT(1) = ANY_VALUE(cnt)
If you run it against the dummy data from your question, as below:
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT STRUCT<app_info STRUCT<app_instance_id STRING>>(STRUCT('1234')) user_dim, [STRUCT<name STRING>('os_update'), STRUCT('initialized_rh_api')] event_dim UNION ALL
SELECT STRUCT(STRUCT('1234')) user_dim, [STRUCT<name STRING>('os_update')] event_dim UNION ALL
SELECT STRUCT(STRUCT('5678')) user_dim, [STRUCT<name STRING>('os_update'), STRUCT('initialized_rh_api')] event_dim UNION ALL
SELECT STRUCT(STRUCT('5678')) user_dim, [STRUCT<name STRING>('other_action')] event_dim
)
SELECT app_instance_id, name
FROM (
SELECT app_instance_id, COUNT(1) cnt,
ARRAY_CONCAT_AGG(names) names
FROM (
SELECT user_dim.app_info.app_instance_id,
ARRAY(SELECT DISTINCT name FROM UNNEST(event_dim) dim) names
FROM `project.dataset.your_table`
)
GROUP BY app_instance_id
), UNNEST(names) name
GROUP BY app_instance_id, name
HAVING COUNT(1) = ANY_VALUE(cnt)
you get the desired result:
Row app_instance_id name
1 1234 os_update