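I am working with a dataset that contains taxi trip data.
These are the JSON files of the data I used to create the tables:
They are also available on this Google Drive
You can ignore the vendor lookup dataset, since it is only a lookup for the different names the same data goes by.
My intention is to produce a result like this: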
| Line | frequency | month | year |
|------ |----------- |------- |------ |
| 1 | 20 | 1 | 2009 |
| 2 | 35 | 2 | 2009 |
| 3 | 90 | 3 | 2009 |
| 4 | 24 | 4 | 2009 |
| 5 | 12 | 5 | 2009 |
The query I tried looks like this:
SELECT COUNT(payment_type) as frequency,
month,
year
FROM
(
SELECT NYCTaxiTrips.pickup_datetime as pickup_datetime,
paymentlookup.string_field_0 as payment_type ,
EXTRACT( MONTH FROM NYCTaxiTrips.pickup_datetime) as month,
EXTRACT( YEAR FROM NYCTaxiTrips.pickup_datetime) as year
FROM `datasprintsteste.datasets.PaymentLookup` as paymentlookup
INNER JOIN
(
SELECT payment_type, pickup_datetime FROM `datasprintsteste.datasets.NYCTaxiTrips2009`
UNION ALL
SELECT payment_type, pickup_datetime FROM `datasprintsteste.datasets.NYCTaxiTrips2010`
UNION ALL
SELECT payment_type, pickup_datetime FROM `datasprintsteste.datasets.NYCTaxiTrips2011`
UNION ALL
SELECT payment_type, pickup_datetime FROM `datasprintsteste.datasets.NYCTaxiTrips2012`
) AS NYCTaxiTrips
ON paymentlookup.string_field_0 = NYCTaxiTrips.payment_type
)
WHERE payment_type = 'Cash'
GROUP BY month, year
But this is the result it gives:
| Line | frequency | month | year |
|------ |----------- |------- |------ |
| 1 | 1389172 | 1 | 2009 |
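I tried not grouping by year, but that produces an error, which I am fairly sure is a syntax error.
How can I write a query that returns the result I want?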
Answer 0 (score: 0)
Here is your SQL code with a small data sample for you to play with and test:
WITH `datasprintsteste.datasets.NYCTaxiTrips2009` AS (
SELECT 'Cash' AS payment_type, TIMESTAMP('2009-01-01 02:18:18.000') as pickup_datetime UNION ALL
SELECT 'Cash' AS payment_type, TIMESTAMP('2009-02-01 02:18:18.000') as pickup_datetime
),
`datasprintsteste.datasets.NYCTaxiTrips2010` AS (
SELECT 'Cash' AS payment_type, TIMESTAMP('2010-03-01 02:18:18.000') as pickup_datetime UNION ALL
SELECT 'Cash' AS payment_type, TIMESTAMP('2010-04-01 02:18:18.000') as pickup_datetime
),
`datasprintsteste.datasets.NYCTaxiTrips2011` AS (
SELECT 'Cash' AS payment_type, TIMESTAMP('2011-03-01 02:18:18.000') as pickup_datetime UNION ALL
SELECT 'Cash' AS payment_type, TIMESTAMP('2011-01-01 02:18:18.000') as pickup_datetime
),
`datasprintsteste.datasets.NYCTaxiTrips2012` AS (
SELECT 'Cash' AS payment_type, TIMESTAMP('2012-05-01 02:18:18.000') as pickup_datetime UNION ALL
SELECT 'Cash' AS payment_type, TIMESTAMP('2012-01-01 02:18:18.000') as pickup_datetime
),
`all` AS (
SELECT * FROM `datasprintsteste.datasets.NYCTaxiTrips2009` UNION ALL
SELECT * FROM `datasprintsteste.datasets.NYCTaxiTrips2010` UNION ALL
SELECT * FROM `datasprintsteste.datasets.NYCTaxiTrips2011` UNION ALL
SELECT * FROM `datasprintsteste.datasets.NYCTaxiTrips2012`
)
SELECT COUNT(payment_type) as frequency,
EXTRACT( MONTH FROM pickup_datetime) as month,
EXTRACT( YEAR FROM pickup_datetime) as year
FROM `all`
GROUP BY month, year
ORDER BY year DESC, month DESC
This produces the expected result broken down by month (note that the data sample does not contain only January data).
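Since the grouping logic itself works on the sample, one possible reason for getting a single row on the real tables is that the raw `payment_type` values are not spelled the same way throughout the data, so that `WHERE payment_type = 'Cash'` matches only part of it. This is an assumption, since the actual data is not shown here. A minimal sketch of a case-insensitive filter, using a hypothetical `trips` CTE with made-up rows and spellings:

```sql
-- Hypothetical sample rows; the spellings 'Cash', 'CASH' and 'Credit'
-- are assumptions for illustration, not values taken from the real tables.
WITH trips AS (
  SELECT 'Cash' AS payment_type, TIMESTAMP('2009-01-01 02:18:18') AS pickup_datetime UNION ALL
  SELECT 'CASH', TIMESTAMP('2009-02-01 02:18:18') UNION ALL
  SELECT 'Credit', TIMESTAMP('2009-02-03 10:00:00') UNION ALL
  SELECT 'Cash', TIMESTAMP('2010-03-01 02:18:18')
)
SELECT
  COUNT(*) AS frequency,
  EXTRACT(MONTH FROM pickup_datetime) AS month,
  EXTRACT(YEAR FROM pickup_datetime) AS year
FROM trips
-- Normalize the case before comparing, so 'Cash' and 'CASH' both match.
WHERE UPPER(payment_type) = 'CASH'
GROUP BY month, year
ORDER BY year, month
```

If the real data uses abbreviations rather than just different casing, filtering on the normalized name from the payment lookup table (rather than on the raw value) would likely be the more robust choice.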