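I have a table named fact_interactions that holds a running history of customer interactions. Each time a customer is contacted, a new record is created with specific details about the interaction. Here is an example: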
inter_id |customer_id |business_id |department_id |datetime_local |outcome_id |
---------|------------|------------|--------------|--------------------|-----------|
46032383 |1 |112 |1916 |2015-01-14 19:54:20 |48 |
55740863 |2 |2 |3358 |2015-05-06 12:02:12 |19 |
49512895 |3 |160 |396 |2015-01-22 11:57:17 |19 |
51822751 |3 |160 |396 |2015-01-28 13:46:19 |19 |
23533190 |4 |132 |425 |2015-03-26 12:42:24 |19 |
69354240 |5 |164 |3061 |2015-03-30 11:01:43 |19 |
61417848 |5 |164 |3061 |2015-04-01 14:36:30 |19 |
74948424 |5 |164 |3061 |2015-04-28 15:12:42 |19 |
75303296 |5 |164 |3061 |2015-04-29 13:51:02 |10 |
76071776 |5 |164 |3061 |2015-05-01 09:18:39 |10 |
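For each record, I need to find all the rows that match several conditions across several time windows. Here is a sample query with a few of the different subqueries I'm using: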
SELECT
    inter_id,
    -- same customer, same business, same department, earlier on the same day
    (SELECT COUNT(*) FROM fact_interactions B
      WHERE B.customer_id = A.customer_id
        AND B.business_id = A.business_id
        AND B.department_id = A.department_id
        AND B.datetime_local::date = A.datetime_local::date
        AND B.datetime_local < A.datetime_local) AS cnt_samesamesame_day0,
    -- same customer, same business, different department, earlier on the same day
    (SELECT COUNT(*) FROM fact_interactions B
      WHERE B.customer_id = A.customer_id
        AND B.business_id = A.business_id
        AND B.department_id <> A.department_id
        AND B.datetime_local::date = A.datetime_local::date
        AND B.datetime_local < A.datetime_local) AS cnt_samesamediff_day0,
    -- same customer, different business, different department, earlier on the same day
    (SELECT COUNT(*) FROM fact_interactions B
      WHERE B.customer_id = A.customer_id
        AND B.business_id <> A.business_id
        AND B.department_id <> A.department_id
        AND B.datetime_local::date = A.datetime_local::date
        AND B.datetime_local < A.datetime_local) AS cnt_samediffdiff_day0
FROM fact_interactions A;
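In total I have 180 subqueries for the counts I'm trying to compute. So if fact_interactions has 1,000,000 records, the output will also have 1,000,000 records, each with inter_id plus 180 count columns. Here are some examples of how these 180 count subqueries would be named, to explain further:
The query does complete but, as you can imagine, it takes a very long time. Computing cnt_samesamesame_day0 alone takes a minute.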
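It's hard to include a sample of the output because it is very sparse.
Any suggestions on how to do this more efficiently? Specific examples are much appreciated, but even a better general approach would be great. Thanks!
(I'm trying to implement this on an Amazon Redshift cluster)
Answer 0: (score: 1)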
I would suggest looking into window functions. For example, cnt_samesamesame_day0 is just the number of earlier rows within each group of customer_id, business_id, department_id and calendar day, which a window function can compute in a single pass over the table instead of one correlated subquery per row.
Other columns might have similar constructs.
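As a rough sketch of how the window-function idea might look for the three day0 columns in the question (this is an illustration, not a tested Redshift query): RANK() minus 1 gives the number of rows in a partition with a strictly earlier datetime_local, and the "different" conditions can then be derived by inclusion-exclusion across partitions. The aliases r_c, r_cb, r_cd, r_cbd and the subquery name ranked are made up for this example, and the sketch assumes customer_id, business_id and department_id are never NULL.

```sql
-- Sketch only: assumes the key columns are never NULL.
-- RANK() - 1 = number of rows in the same partition with a strictly
-- earlier datetime_local (rows with tied timestamps are not counted).
SELECT
    inter_id,
    -- same customer, same business, same department
    r_cbd - 1                 AS cnt_samesamesame_day0,
    -- same customer, same business, different department
    r_cb - r_cbd              AS cnt_samesamediff_day0,
    -- same customer, different business AND different department:
    -- all earlier rows, minus same-business rows, minus same-department
    -- rows, plus rows that were subtracted twice
    r_c - r_cb - r_cd + r_cbd AS cnt_samediffdiff_day0
FROM (
    SELECT
        inter_id,
        RANK() OVER (PARTITION BY customer_id,
                                  datetime_local::date
                     ORDER BY datetime_local) AS r_c,
        RANK() OVER (PARTITION BY customer_id, business_id,
                                  datetime_local::date
                     ORDER BY datetime_local) AS r_cb,
        RANK() OVER (PARTITION BY customer_id, department_id,
                                  datetime_local::date
                     ORDER BY datetime_local) AS r_cd,
        RANK() OVER (PARTITION BY customer_id, business_id, department_id,
                                  datetime_local::date
                     ORDER BY datetime_local) AS r_cbd
    FROM fact_interactions
) ranked;
```

Each window is evaluated once per partitioning scheme rather than once per row, so the same four ranks can feed several of the count columns. It would be worth comparing the results against the original subqueries on a small sample before relying on them.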