Stream Analytics left outer join produces no rows

Asked: 2019-12-05 13:07:59

Tags: azure-stream-analytics

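What I am trying to do:

I want to "throttle" an input stream on its way to the output. Specifically, because I receive many similar inputs, I only want to produce an output if no output has been produced in the last N hours.

For example, think of the input as "send an email", of which I receive dozens or hundreds of events. I only want to send an email if I have not sent one in the last N hours (or have never sent one).

See the last example here: https://docs.microsoft.com/en-us/stream-analytics-query/join-azure-stream-analytics#examples, which is similar to what I am trying to do.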

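My setup is as follows:

My query has two inputs:

  1. Ingress: this is the "raw" input stream
  2. Throttled-Sent: this is just a consumer group on my output stream

My query is as follows: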

WITH
AllEvents as (
    /* This CTE is here only because we can't seem to use GetMetadataPropertyValue in a join clause, so "materialize" the values here for later use */
    SELECT
        *,
        GetMetadataPropertyValue([Ingress], '[User].[Type]') AS Type,
        GetMetadataPropertyValue([Ingress], '[User].[NotifyType]') AS NotifyType,
        GetMetadataPropertyValue([Ingress], '[User].[NotifyEntityId]') AS NotifyEntityId
    FROM
        [Ingress]
),
UseableEvents as (
    SELECT  *
    FROM    AllEvents
    WHERE   NotifyEntityId IS NOT NULL
),
AlreadySentEvents as (
    /* These are the events that would have been previously output (referenced here via a consumer group). We want to capture these to make sure we are not sending new events when an older "already-sent" event can be found */
    SELECT
        *,
        GetMetadataPropertyValue([Throttled-Sent], '[User].[Type]') AS Type,
        GetMetadataPropertyValue([Throttled-Sent], '[User].[NotifyType]') AS NotifyType,
        GetMetadataPropertyValue([Throttled-Sent], '[User].[NotifyEntityId]') AS NotifyEntityId
    FROM
        [Throttled-Sent]
)

SELECT  i.*
INTO    Throttled
FROM    UseableEvents i
/* Left join our sent events, looking for those within a particular time frame */
LEFT OUTER JOIN AlreadySentEvents s
    ON i.Type = s.Type
    AND i.NotifyType = s.NotifyType
    AND i.NotifyEntityId = s.NotifyEntityId
    AND DATEDIFF(hour, i, s) BETWEEN 0 AND 4
WHERE   s.Type IS NULL /* The IS NULL check returns only those Ingress rows that have no corresponding AlreadySentEvents row */

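The results I am seeing:

This query produces no rows in the output. However, I think it should produce something, because the Throttled-Sent input starts out with zero rows. I have verified that my Ingress events are arriving (by simply adjusting the query to remove the left join and checking the results).

I suspect my problem is related to one of the following:

  1. I cannot use an input that is a consumer group on my own output (though I do not know why that would not be allowed)
  2. My usage or understanding of DATEDIFF is incorrect

Any help or guidance is appreciated!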

1 answer:

Answer 0: (score: 0)

For throttling, I suggest looking at the IsFirst function. It may be a simpler solution, and it does not require reading back from the output.
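A minimal sketch of how that might look, reusing the input and output names from the question. It rests on a few assumptions: the 4-hour interval is taken from the question's join condition, and partitioning on NotifyEntityId alone is a simplification (Type and NotifyType would need to be folded into the key to match the original join). Note also that ISFIRST works on fixed intervals aligned like tumbling windows, so this is not exactly "no output in the previous 4 hours": two events that fall just before and just after an interval boundary would both pass.

```sql
WITH
AllEvents AS (
    /* Materialize the user property so it can be used as a partition key */
    SELECT
        *,
        GetMetadataPropertyValue([Ingress], '[User].[NotifyEntityId]') AS NotifyEntityId
    FROM
        [Ingress]
),
UseableEvents AS (
    /* Filter in a separate step so only keyed events reach ISFIRST */
    SELECT  *
    FROM    AllEvents
    WHERE   NotifyEntityId IS NOT NULL
)

SELECT  *
INTO    Throttled
FROM    UseableEvents
/* Keep only the first event per entity in each fixed 4-hour interval.
   Assumption: partitioning on NotifyEntityId alone is enough here. */
WHERE   ISFIRST(hour, 4) OVER (PARTITION BY NotifyEntityId) = 1
```

With this shape, the Throttled-Sent input and the consumer group on the output are no longer needed.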

For the current query, I think the order of the DATEDIFF arguments needs to be swapped, because s occurs before i: DATEDIFF(hour, s, i) BETWEEN 0 AND 4
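Applied to the query from the question, the change might look like the sketch below. Only the DATEDIFF argument order differs from the original. The reasoning is that DATEDIFF(hour, s, i) is non-negative when the sent event s is earlier than the incoming event i, which is the direction the throttle needs. As far as I understand it, the original order asks for sent events up to 4 hours after i, so an unmatched row could not be emitted until that window had passed. Whether reading the job's own output back as an input behaves as hoped is not something this sketch confirms.

```sql
WITH
AllEvents AS (
    SELECT
        *,
        GetMetadataPropertyValue([Ingress], '[User].[Type]') AS Type,
        GetMetadataPropertyValue([Ingress], '[User].[NotifyType]') AS NotifyType,
        GetMetadataPropertyValue([Ingress], '[User].[NotifyEntityId]') AS NotifyEntityId
    FROM
        [Ingress]
),
UseableEvents AS (
    SELECT  *
    FROM    AllEvents
    WHERE   NotifyEntityId IS NOT NULL
),
AlreadySentEvents AS (
    SELECT
        *,
        GetMetadataPropertyValue([Throttled-Sent], '[User].[Type]') AS Type,
        GetMetadataPropertyValue([Throttled-Sent], '[User].[NotifyType]') AS NotifyType,
        GetMetadataPropertyValue([Throttled-Sent], '[User].[NotifyEntityId]') AS NotifyEntityId
    FROM
        [Throttled-Sent]
)

SELECT  i.*
INTO    Throttled
FROM    UseableEvents i
LEFT OUTER JOIN AlreadySentEvents s
    ON i.Type = s.Type
    AND i.NotifyType = s.NotifyType
    AND i.NotifyEntityId = s.NotifyEntityId
    /* s first, then i: match sent events from the 4 hours before i */
    AND DATEDIFF(hour, s, i) BETWEEN 0 AND 4
WHERE   s.Type IS NULL /* keep only Ingress rows with no recent sent event */
```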