Passing parameters from two SELECT queries into another SELECT query

Asked: 2016-07-18 09:23:58

Tags: google-bigquery

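I am trying to compare two different values of the same column in a SELECT query. Here is my code; so far I am only passing one value (from the last subquery). Can you help me work out how to pass the second value from the other subquery? To make it clearer: I want to compare the startTime of rows where endTime is NULL (the time a user logged in to our website but did not complete an order) with the startTime of rows where endTime is not NULL (the user logged in and placed an order).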

SELECT
  notFinished,
  finished,
  DATEDIFF(notFinished, finished) as dateDifference,
  emailAddress,
  phone,
  __key__.id
FROM  (
  SELECT
    startTime AS finished
  FROM
    [datastore_dump.Orders]
  WHERE
    emailAddress IN (
    SELECT
      emailAddress
    FROM
      [datastore_dump.Orders]
    WHERE
      endTime IS NULL)
    AND endTime IS NOT NULL and emailAddress is not null ),
(
  SELECT
    emailAddress,
    phone,
    __key__.id,
    startTime AS notFinished
  FROM
    [datastore_dump.Orders]
  WHERE
    endTime IS NULL)

Thanks!

2 answers:

Answer 0 (score: 1)

Consider the following logic:

SELECT
  emailAddress, startTime, endTime, DATEDIFF(endTime, startTime) AS daysDifference
FROM (
  SELECT
    emailAddress, startTime, endTime, status,
    LAG(status) OVER(PARTITION BY emailAddress ORDER BY startTime) AS prevStatus
  FROM (
    SELECT
      emailAddress, startTime, endTime,
      IF(endTime IS NULL, "not-finished", "finished") AS status
    FROM [datastore_dump.Orders]
  )
)
WHERE status = "finished"
AND prevStatus = "not-finished"

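What it does:
1. Classifies each record's status as finished or not-finished based on endTime.
2. Finds the previous status for each record (prevStatus), per emailAddress and ordered by startTime.
3. For records whose status is finished and whose previous status is not-finished, calculates the difference.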
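
For readers who want to trace these three steps outside BigQuery, here is a minimal Python sketch of the same idea. The sample rows and the function name `finished_after_unfinished` are hypothetical and not part of the original answer, and `(end - start).days` is only a rough stand-in for legacy SQL's DATEDIFF.

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (emailAddress, startTime, endTime).
# endTime of None means the order was not finished.
orders = [
    ("a@example.com", datetime(2016, 7, 1, 9), None),
    ("a@example.com", datetime(2016, 7, 3, 9), datetime(2016, 7, 5, 9)),
    ("b@example.com", datetime(2016, 7, 2, 9), datetime(2016, 7, 2, 10)),
]


def finished_after_unfinished(rows):
    """Return finished orders whose previous order (same email, by startTime) was not finished."""
    result = []
    # Equivalent of PARTITION BY emailAddress ORDER BY startTime.
    ordered = sorted(rows, key=itemgetter(0, 1))
    for email, group in groupby(ordered, key=itemgetter(0)):
        prev_status = None  # LAG() yields NULL for the first row of a partition
        for _, start, end in group:
            # Step 1: classify the record based on endTime.
            status = "not-finished" if end is None else "finished"
            # Step 3: keep finished records that follow a not-finished one.
            if status == "finished" and prev_status == "not-finished":
                result.append((email, start, end, (end - start).days))
            # Step 2: remember this status as the next record's prevStatus.
            prev_status = status
    return result


matches = finished_after_unfinished(orders)
for row in matches:
    print(row)
```

Only the second order for a@example.com qualifies here, because it is finished and directly follows an unfinished order from the same address; the single order for b@example.com has no previous status.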

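Hope this is close to what you are asking for.

Answer 1 (score: 0)

This type of logic is easier to express in standard SQL using a WITH clause (uncheck the "Use Legacy SQL" box under "Show Options"). To get you started, you probably want something like this: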

WITH CompletedOrders AS (
  SELECT
    startTime AS finished
  FROM
    datastore_dump.Orders
  WHERE
    emailAddress IN (
    SELECT
      emailAddress
    FROM
      datastore_dump.Orders
    WHERE
      endTime IS NULL)
    AND endTime IS NOT NULL and emailAddress IS NOT NULL
), IncompleteOrders AS (
  SELECT
    emailAddress,
    phone,
    __key__.id,
    startTime AS notFinished
  FROM
    datastore_dump.Orders
  WHERE
    endTime IS NULL)
SELECT ...

As a working example, you can try:

WITH Orders AS (
  SELECT 'foo@example.com' AS email, CURRENT_TIMESTAMP() AS time
  UNION ALL SELECT 'bar@example.com' AS email, NULL AS time
  UNION ALL SELECT
    'baz@example.com',
    TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR) AS time),
CompletedOrders AS (
  SELECT * FROM Orders WHERE time IS NOT NULL),
IncompleteOrders AS (
  SELECT * FROM Orders WHERE time IS NULL)
SELECT
  (SELECT COUNT(*) FROM CompletedOrders) AS completed_count,
  (SELECT COUNT(*) FROM IncompleteOrders) AS incomplete_count;
+-----------------+------------------+
| completed_count | incomplete_count |
+-----------------+------------------+
|               2 |                1 |
+-----------------+------------------+