I have the following table:
Row ID AltID1 Latitude Longitude AltID2
1 16055000700 292367877 47.724477 -116.826249 83815818845
2 16055000700 292367882 47.724906 -116.827074 83815819235
3 16055000700 292409477 47.720201 -116.804307 83815834156
...
396 16055000800 292413726 47.69276 -116.810874 83814559302
397 16055000800 292413725 47.692863 -116.811014 83814559312
398 16055000800 292414050 47.693109 -116.811462 83814559728
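That is, a table containing several groups of rows that share the same ID. I need to figure out how to group by ID and collect the AltID1, Latitude, Longitude, and AltID2 values associated with each ID. The result should be exported as CSV and laid out so that it is easy to process.
The final result should look like this:
Line 1:
ID Count Data
16055000700 3 "[[292367877, 47.724477, -116.826249, 83815818845], [292367882, 47.724906, -116.827074, 83815819235], [292409477, 47.720201, -116.804307, 83815834156]]"
Line 2:
...
The first column is the ID, the second is the number of rows in the original table with that ID, and the third is an array of arrays: one inner array per original row (three in this example), each holding that row's AltID1, Latitude, Longitude, and AltID2 values.
With some help, I got as far as this code: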
WITH data AS (
  SELECT *
  FROM UNNEST(ARRAY<STRUCT<id INT64, altid1 INT64, lat FLOAT64, lon FLOAT64, altid2 INT64>>[
    (16055000700, 292367877, 47.724477, -116.826249, 83815818845),
    (16055000700, 292367882, 47.724906, -116.827074, 83815819235),
    (16055000800, 292414050, 47.693109, -116.811462, 83814559728)
  ])
)
SELECT
  id,
  CONCAT('[', STRING_AGG(TO_JSON_STRING(ARRAY<FLOAT64>[altid1, lat, lon, altid2])), ']')
FROM data d
GROUP BY id
Suppose I have a table MyTable with this schema:
FieldName Type Mode
ID INTEGER NULLABLE
altid1 INTEGER NULLABLE
lat FLOAT NULLABLE
lon FLOAT NULLABLE
altid2 INTEGER NULLABLE
How can I use a SELECT statement to produce this part from the data in MyTable, instead of hard-coding it?
[(16055000700, 292367877, 47.724477, -116.826249, 83815818845),
 (16055000700, 292367882, 47.724906, -116.827074, 83815819235),
 (16055000800, 292414050, 47.693109, -116.811462, 83814559728)]
Answer 0 (score: 1)
You can use TO_JSON_STRING() to get close to the desired result, then aggregate those strings into one larger string:
WITH data AS (
SELECT *
FROM `bigquery-public-data.noaa_gsod.gsod2017`
WHERE stn IN ('998258','995011','996080') AND mo="02" AND da<'03'
)
SELECT stn, FORMAT('[%s]', STRING_AGG(values)) values
FROM (
SELECT stn, TO_JSON_STRING([min,max,temp]) values
FROM `data`
)
GROUP BY 1
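One detail worth noting about this pattern: STRING_AGG does not promise any particular order of the aggregated values unless an ORDER BY is given inside the call. If the order of the inner arrays matters for later processing, something like the following might help. This is only a sketch: the CTE name `sample` is made up for illustration, its rows are copied from the question, and ordering by AltID2 is an arbitrary choice.

```sql
#standardSQL
-- Inline sample rows copied from the question; swap the CTE for a real table.
WITH sample AS (
  SELECT 16055000700 AS ID, 292367877 AS AltID1, 47.724477 AS Latitude,
         -116.826249 AS Longitude, 83815818845 AS AltID2 UNION ALL
  SELECT 16055000700, 292367882, 47.724906, -116.827074, 83815819235 UNION ALL
  SELECT 16055000800, 292414050, 47.693109, -116.811462, 83814559728
)
SELECT
  ID,
  COUNT(1) AS rows_count,
  -- ORDER BY inside STRING_AGG fixes the order of the inner arrays per ID.
  FORMAT('[%s]', STRING_AGG(
    TO_JSON_STRING([AltID1, Latitude, Longitude, AltID2])
    ORDER BY AltID2)) AS data
FROM sample
GROUP BY ID
ORDER BY ID
```

Any column, or several, can serve as the sort key; the point is only that the key is stated explicitly.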
Answer 1 (score: 1)
The following is for BigQuery Standard SQL:
#standardSQL
SELECT ID, COUNT(1) rows_count,
CONCAT('[', STRING_AGG(TO_JSON_STRING([AltID1, Latitude, Longitude, AltID2])), ']') data
FROM `project.dataset.table`
GROUP BY ID
You can test and play with this using the sample data from the question, as in the following example:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 16055000700 ID, 292367877 AltID1, 47.724477 Latitude, -116.826249 Longitude, 83815818845 AltID2 UNION ALL
SELECT 16055000700, 292367882, 47.724906, -116.827074, 83815819235 UNION ALL
SELECT 16055000700, 292409477, 47.720201, -116.804307, 83815834156 UNION ALL
SELECT 16055000800, 292413726, 47.69276, -116.810874, 83814559302 UNION ALL
SELECT 16055000800, 292413725, 47.692863, -116.811014, 83814559312 UNION ALL
SELECT 16055000800, 292414050, 47.693109, -116.811462, 83814559728
)
SELECT ID, COUNT(1) rows_count,
CONCAT('[', STRING_AGG(TO_JSON_STRING([AltID1, Latitude, Longitude, AltID2])), ']') data
FROM `project.dataset.table`
GROUP BY ID
with the result:
Row ID rows_count data
1 16055000700 3 [[292367877,47.724477,-116.826249,83815818845],[292367882,47.724906,-116.827074,83815819235],[292409477,47.720201,-116.804307,83815834156]]
2 16055000800 3 [[292413726,47.69276,-116.810874,83814559302],[292413725,47.692863,-116.811014,83814559312],[292414050,47.693109,-116.811462,83814559728]]
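Since the question also asks for a CSV export, one possible way to write this result straight to files is BigQuery's EXPORT DATA statement. The sketch below makes several assumptions: the bucket and path in `uri` are hypothetical, the table name is the same placeholder used above, and the account running the query needs write access to that bucket.

```sql
#standardSQL
EXPORT DATA OPTIONS (
  -- Hypothetical Cloud Storage location; the URI must contain a single * wildcard.
  uri = 'gs://my-bucket/grouped/part-*.csv',
  format = 'CSV',
  overwrite = true,
  header = true
) AS
SELECT ID, COUNT(1) AS rows_count,
  CONCAT('[', STRING_AGG(TO_JSON_STRING([AltID1, Latitude, Longitude, AltID2])), ']') AS data
FROM `project.dataset.table`
GROUP BY ID
```

The data column contains commas, so it is worth checking in the exported file that the field comes out quoted, as in the layout the question asks for. Each data value is valid JSON, so any JSON parser can turn it back into nested arrays on the consuming side.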