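I thought I could get what I needed by asking a simpler question about a simpler data example in an earlier question, but I still need some help.
I'm pretty new to querying JSON-style data in BigQuery, and I'm having trouble with the analytics (events) data that Firebase dumps into BigQuery for me. Below is the format of one row of data (with some of the fluff trimmed).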
{
"user_dim": {
"user_id": "some_identifier_here",
"user_properties": [
{
"key": "special_key1",
"val": {
"val": {
"str_val": "894",
"int_val": null
}
}
},
{
"key": "special_key2",
"val": {
"val": {
"str_val": "1",
"int_val": null
}
}
},
{
"key": "special_key3",
"val": {
"val": {
"str_val": "23",
"int_val": null
}
}
}
],
"device_info": {
"device_category": "mobile",
"mobile_brand_name": "Samsung",
"mobile_model_name": "model_phone"
},
"dt_a": "1470625311138000",
"dt_b": "1470620345566000"
},
"event_dim": [
{
"name": "user_engagement",
"params": [
{
"key": "firebase_event_origin",
"value": {
"string_value": "auto",
"int_value": null,
"float_value": null,
"double_value": null
}
},
{
"key": "engagement_time_msec",
"value": {
"string_value": null,
"int_value": "30006",
"float_value": null,
"double_value": null
}
}
],
"timestamp_micros": "1470675614434000",
"previous_timestamp_micros": "1470675551092000"
},
{
"name": "new_game",
"params": [
{
"key": "total_time",
"value": {
"string_value": "496048",
"int_value": null,
"float_value": null,
"double_value": null
}
},
{
"key": "armor",
"value": {
"string_value": "2",
"int_value": null,
"float_value": null,
"double_value": null
}
},
{
"key": "reason",
"value": {
"string_value": "power_up",
"int_value": null,
"float_value": null,
"double_value": null
}
}
],
"timestamp_micros": "1470675825988001",
"previous_timestamp_micros": "1470675282500001"
},
{
"name": "user_engagement",
"params": [
{
"key": "firebase_event_origin",
"value": {
"string_value": "auto",
"int_value": null,
"float_value": null,
"double_value": null
}
},
{
"key": "engagement_time_msec",
"value": {
"string_value": null,
"int_value": "318030",
"float_value": null,
"double_value": null
}
}
],
"timestamp_micros": "1470675972778002",
"previous_timestamp_micros": "1470675614434002"
},
{
"name": "won_game",
"params": [
{
"key": "total_time",
"value": {
"string_value": "497857",
"int_value": null,
"float_value": null,
"double_value": null
}
},
{
"key": "level",
"value": {
"string_value": null,
"int_value": "207",
"float_value": null,
"double_value": null
}
},
{
"key": "sword",
"value": {
"string_value": "iron",
"int_value": null,
"float_value": null,
"double_value": null
}
}
],
"timestamp_micros": "1470677171374007",
"previous_timestamp_micros": "1470671343784007"
}
]
}
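Based on the answer to my original question, I was able to get things working for the first part of the object, user_dim. However, whenever I try a similar approach on the event_dim field (unnesting it), the query fails with the message "Error: Scalar subquery produced more than one element." I suspect this is because event_dim is itself an array, and it contains structs that in turn contain arrays.
In case it helps, here is the basic query that gives me the error, though I should note that I am completely out of my element with this type of data in BQ and may be going about this all wrong: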
SELECT
(SELECT name FROM UNNEST(event_dim) WHERE name = 'user_engagement') AS event_name
FROM
my_table;
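The end result I'm after is a query that can convert a table containing many objects of this type into a table with one row per event in each object's event_dim array. That is, for the example object above, I would like it to output 4 rows, where the first set of columns is identical across rows and is just the metadata from user_dim. After those I want columns that I define explicitly, based on the event parameters I know can exist, such as event_name, firebase_event_origin, engagement_time_msec, total_time, armor, reason, level, sword, each populated with the value of that event's parameter, or NULL if it doesn't exist.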
Answer 0 (score: 6)
Building on Mikhail's answer, but run against an actual Firebase dataset:
SELECT
user_dim.app_info.app_instance_id,
timestamp_micros,
(SELECT value.int_value FROM UNNEST(dim.params) WHERE key = "level") AS level,
(SELECT value.int_value FROM UNNEST(dim.params) WHERE key = "coins") AS coins,
(SELECT value.int_value FROM UNNEST(dim.params) WHERE key = "powerups") AS powerups
FROM `dataset.table`, UNNEST(event_dim) AS dim
WHERE timestamp_micros=1464718937589000
(Saving it here for future reference and easier copy-pasting.)
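On the error in the question: a scalar subquery may return at most one row, and the sample event_dim array holds two events named user_engagement, so the filter matches twice. A minimal sketch of one way around that, using a made-up inline my_table that keeps only the name field (an assumption for illustration; the real table has more fields):

```sql
-- Hypothetical stand-in for my_table: one row whose event_dim keeps only `name`.
WITH my_table AS (
  SELECT [
    STRUCT('user_engagement' AS name),
    STRUCT('new_game' AS name),
    STRUCT('user_engagement' AS name)
  ] AS event_dim
)
SELECT
  -- ARRAY(...) collects every match instead of requiring at most one.
  ARRAY(SELECT name FROM UNNEST(event_dim) WHERE name = 'user_engagement') AS event_names
FROM my_table;
```

If one output row per event is wanted rather than an array per input row, joining with UNNEST(event_dim) in the FROM clause, as the query above does, is the usual alternative.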
Answer 1 (score: 3)
Hopefully the query below gives you a push toward the next step:
WITH YourTable AS (
SELECT ARRAY[
STRUCT(
"user_engagement" AS name,
ARRAY<STRUCT<key STRING, val STRUCT<str_val STRING, int_val INT64>>>[
STRUCT("firebase_event_origin", STRUCT("auto", NULL)),
STRUCT("engagement_time_msec", STRUCT("30006", NULL))] AS params,
1470675614434000 AS TIMESTAMP_MICROS,
1470675551092000 AS previous_timestamp_micros
),
STRUCT(
"new_game" AS name,
ARRAY<STRUCT<key STRING, val STRUCT<str_val STRING, int_val INT64>>>[
STRUCT("total_time", STRUCT("496048", NULL)),
STRUCT("armor", STRUCT("2", NULL)),
STRUCT("reason", STRUCT("power_up", NULL))] AS params,
1470675825988001 AS TIMESTAMP_MICROS,
1470675282500001 AS previous_timestamp_micros
),
STRUCT(
"user_engagement" AS name,
ARRAY<STRUCT<key STRING, val STRUCT<str_val STRING, int_val INT64>>>[
STRUCT("firebase_event_origin", STRUCT("auto", NULL)),
STRUCT("engagement_time_msec", STRUCT("318030", NULL))] AS params,
1470675972778002 AS TIMESTAMP_MICROS,
1470675614434002 AS previous_timestamp_micros
),
STRUCT(
"won_game" AS name,
ARRAY<STRUCT<key STRING, val STRUCT<str_val STRING, int_val INT64>>>[
STRUCT("total_time", STRUCT("497857", NULL)),
STRUCT("level", STRUCT("207", NULL)),
STRUCT("sword", STRUCT("iron", NULL))] AS params,
1470677171374007 AS TIMESTAMP_MICROS,
1470671343784007 AS previous_timestamp_micros
)
] AS event_dim
)
SELECT
name,
(SELECT val.str_val FROM UNNEST(dim.params) WHERE key = "firebase_event_origin") AS firebase_event_origin,
(SELECT val.str_val FROM UNNEST(dim.params) WHERE key = "engagement_time_msec") AS engagement_time_msec,
(SELECT val.str_val FROM UNNEST(dim.params) WHERE key = "total_time") AS total_time,
(SELECT val.str_val FROM UNNEST(dim.params) WHERE key = "armor") AS armor,
(SELECT val.str_val FROM UNNEST(dim.params) WHERE key = "reason") AS reason,
(SELECT val.str_val FROM UNNEST(dim.params) WHERE key = "level") AS level,
(SELECT val.str_val FROM UNNEST(dim.params) WHERE key = "sword") AS sword
FROM YourTable, UNNEST(event_dim) AS dim
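Applied to the field names shown in the question's sample row (value.string_value and value.int_value rather than val.str_val), the same pattern might look like the sketch below. It also carries the user_dim columns onto every event row and reads whichever of the two value fields is populated. The inline my_table is a trimmed, hypothetical copy of the sample, and int_value is assumed to be INT64 in the real table, so treat this as a starting point rather than a tested query against a real export:

```sql
-- Sketch only: inline my_table is a trimmed, hypothetical copy of the sample row.
WITH my_table AS (
  SELECT
    STRUCT(
      'some_identifier_here' AS user_id,
      STRUCT('mobile' AS device_category) AS device_info
    ) AS user_dim,
    ARRAY[
      STRUCT(
        'user_engagement' AS name,
        ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64>>>[
          STRUCT('firebase_event_origin', STRUCT('auto', CAST(NULL AS INT64))),
          STRUCT('engagement_time_msec', STRUCT(CAST(NULL AS STRING), 30006))
        ] AS params,
        1470675614434000 AS timestamp_micros
      ),
      STRUCT(
        'won_game' AS name,
        ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64>>>[
          STRUCT('total_time', STRUCT('497857', CAST(NULL AS INT64))),
          STRUCT('level', STRUCT(CAST(NULL AS STRING), 207))
        ] AS params,
        1470677171374007 AS timestamp_micros
      )
    ] AS event_dim
)
SELECT
  -- user_dim metadata is repeated on every event row.
  user_dim.user_id,
  user_dim.device_info.device_category,
  dim.name AS event_name,
  dim.timestamp_micros,
  -- Each subquery returns NULL when the event has no parameter with that key.
  (SELECT COALESCE(value.string_value, CAST(value.int_value AS STRING))
   FROM UNNEST(dim.params) WHERE key = 'firebase_event_origin') AS firebase_event_origin,
  (SELECT COALESCE(value.string_value, CAST(value.int_value AS STRING))
   FROM UNNEST(dim.params) WHERE key = 'engagement_time_msec') AS engagement_time_msec,
  (SELECT COALESCE(value.string_value, CAST(value.int_value AS STRING))
   FROM UNNEST(dim.params) WHERE key = 'total_time') AS total_time,
  (SELECT COALESCE(value.string_value, CAST(value.int_value AS STRING))
   FROM UNNEST(dim.params) WHERE key = 'level') AS level
FROM my_table, UNNEST(event_dim) AS dim;
```

The remaining columns (armor, reason, sword) follow the same shape. Each scalar subquery relies on a key appearing at most once within a single event's params; if a key could repeat, the "more than one element" error would come back for that row.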