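I am trying to build a Hive query that matches customers who have used only the features below, or a combination of them. The features are:
name = "summary"
name = "details"
name1 = "vehicle stats"
name1 = "accelerometer"
I have to count the number of customers who strictly satisfy the conditions above. For example, in the table below, customer "Joy" should not be counted: even though he has "summary" and "details" in name and "vehicle stats" and "accelerometer" in name1, he has additionally done "expenses" in name.
Similarly, customer "Lan" should not be counted, because he has additionally done "speeding" in name1, which is not among the conditions above.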
customername  name      name1
Joy           summary   vehicle stats
Joy           details   accelerometer
Joy           expenses  speeding
Lan           summary   vehicle stats
Lan           details   accelerometer
Lan           details   speeding
Hana          details   accelerometer
Hana          summary   vehicle stats
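The count for the table above must be 1, because only one customer (Hana) has done only "summary" and "details" in name and only "vehicle stats" and "accelerometer" in name1.
This is my query so far: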
select name, name1, count(distinct(customername))
from table1
where date_time between "2017-01-01 00:00:00" and "2017-01-10 00:00:00"
group by name, name1
having name in ('summary', 'details')
or name1 in ('vehicle stats', 'accelerometer')
Any suggestions would be great!
Answer 0 (score: 0)
Part 1
select customername
from table1
group by customername
having count
       (
           case
               when name in ('summary', 'details')
                 or name1 in ('vehicle stats','accelerometer')
               then 1
           end
       ) > 0
   and count
       (
           case
               when name not in ('summary', 'details')
                 or name1 not in ('vehicle stats','accelerometer')
               then 1
           end
       ) = 0
+--------------+
| customername |
+--------------+
| Hana         |
+--------------+
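Since the question asks for the number of such customers rather than their names, the Part 1 query can be wrapped in an outer count. The following is only a sketch under assumptions: the alias customer_count and the subquery alias t are made up for illustration, and the date filter is copied from the question's own query (the sample table does not show a date_time column).

```sql
-- Sketch: count the customers returned by the Part 1 query.
-- customer_count and t are illustrative aliases, not names from the original post.
select count(*) as customer_count
from (
    select customername
    from table1
    -- Date range taken from the question; drop or adjust it as needed.
    where date_time between '2017-01-01 00:00:00' and '2017-01-10 00:00:00'
    group by customername
    -- At least one row uses an allowed value ...
    having count(
               case
                   when name in ('summary', 'details')
                     or name1 in ('vehicle stats', 'accelerometer')
                   then 1
               end
           ) > 0
    -- ... and no row uses a value outside the allowed lists.
       and count(
               case
                   when name not in ('summary', 'details')
                     or name1 not in ('vehicle stats', 'accelerometer')
                   then 1
               end
           ) = 0
) t
```

The case expression returns NULL when its condition is not met, and count ignores NULLs, so each count only tallies the rows that satisfy its condition. If all sample rows fall inside the date range, only Hana survives the having clause.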
Part 2
select name
      ,name1
      ,count(*)
from (select sort_array(collect_set(name)) as name
            ,sort_array(collect_set(name1)) as name1
      from table1
      group by customername
      having count
             (
                 case
                     when name in ('summary', 'details')
                       or name1 in ('vehicle stats','accelerometer')
                     then 1
                 end
             ) > 0
         and count
             (
                 case
                     when name not in ('summary', 'details')
                       or name1 not in ('vehicle stats','accelerometer')
                     then 1
                 end
             ) = 0
     ) t
group by name
        ,name1
+-----------------------+-----------------------------------+----+
| name                  | name1                             | c2 |
+-----------------------+-----------------------------------+----+
| ["details","summary"] | ["accelerometer","vehicle stats"] | 1  |
+-----------------------+-----------------------------------+----+
Answer 1 (score: 0)
You can also use collect_set to check that only the specified entries are present in these columns.
select customername
from table1
where date_time between "2017-01-01 00:00:00" and "2017-01-10 00:00:00"
group by customername
having concat_ws(',',collect_set(name)) = 'summary,details'
and concat_ws(',',collect_set(name1)) = 'vehicle stats,accelerometer'
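Note that collect_set does not guarantee the order of its elements, so its output has to be sorted (for example with sort_array) before it is concatenated and compared.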
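The sorting caveat can be sketched as follows. This is a hedged variant rather than the answerer's exact query: it wraps each collect_set in sort_array (the same function Answer 0 uses) and rewrites the comparison strings in ascending order to match.

```sql
-- Sketch: compare sorted, de-duplicated values per customer.
select customername
from table1
where date_time between '2017-01-01 00:00:00' and '2017-01-10 00:00:00'
group by customername
-- sort_array orders the set ascending, so the literals are written in that order.
having concat_ws(',', sort_array(collect_set(name)))  = 'details,summary'
   and concat_ws(',', sort_array(collect_set(name1))) = 'accelerometer,vehicle stats'
```

This form only matches customers who have exactly both values in each column. A customer with, say, only "summary" in name would not match, so if partial combinations should count as well, the conditional-count approach from Answer 0 is the closer fit.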