Please help. I have a column Delay_Reason that contains the following values. A value may or may not end with ";#".
DT1-Increased_CT_Reason_Start_to_Accept
ERIC_Drive Test Taking too Long;#ERIC_Lack Of GSC Resources/Queuing DT Drives;#ERIC_Cluster Having Too Many RF Issues Needing Tuning;#
ERIC_Drive Test Taking too Long;#ERIC_Lack Of GSC Resources/Queuing DT Drives;#
ERIC_Drive Test Taking too Long;#
I need to count how many times each delay reason occurs.
My desired output is:
DT1-Increased_CT_Reason_Start_to_Accept count
ERIC_Drive Test Taking too Long 3
ERIC_Lack Of GSC Resources/Queuing DT Drives 2
ERIC_Cluster Having Too Many RF Issues Needing Tuning 1
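Answer 0 (score: 4)
One trick you can use is to compare the length of the Delay_Reason column with the length of the same column after a single letter has been removed from it. Summing this difference over the whole table gives the number of occurrences of that letter.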
SELECT 'a' AS Delay_Reason,
SUM(CHAR_LENGTH(Delay_Reason) - CHAR_LENGTH(REPLACE(Delay_Reason, 'a', ''))) count
FROM yourTable
UNION ALL
SELECT 'b',
SUM(CHAR_LENGTH(Delay_Reason) - CHAR_LENGTH(REPLACE(Delay_Reason, 'b', '')))
FROM yourTable
UNION ALL
SELECT 'c',
SUM(CHAR_LENGTH(Delay_Reason) - CHAR_LENGTH(REPLACE(Delay_Reason, 'c', '')))
FROM yourTable
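To see why the length difference equals the occurrence count, here is a minimal worked example on a string literal rather than the table. The literal 'banana' is only an illustration and is not part of the question's data.

```sql
-- 'banana' has 6 characters; removing every 'a' leaves 'bnn' (3 characters).
-- The difference, 6 - 3 = 3, is the number of times 'a' occurs.
SELECT CHAR_LENGTH('banana')                                          AS full_length,     -- 6
       CHAR_LENGTH(REPLACE('banana', 'a', ''))                        AS stripped_length, -- 3
       CHAR_LENGTH('banana') - CHAR_LENGTH(REPLACE('banana', 'a', '')) AS occurrences;    -- 3
```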
Update
If you want the query above to work for terms longer than a single character, you only need to normalize by the length of the search term:
SELECT 'Los NE abc' AS Delay_Reason,
SUM(CHAR_LENGTH(Delay_Reason) - CHAR_LENGTH(REPLACE(Delay_Reason,'Los NE abc','')))
/ CHAR_LENGTH('Los NE abc') AS count
FROM yourTable
UNION ALL
SELECT 'Angeles',
SUM(CHAR_LENGTH(Delay_Reason) - CHAR_LENGTH(REPLACE(Delay_Reason,'Angeles','')))
/ CHAR_LENGTH('Angeles')
FROM yourTable
UNION ALL
SELECT 'California',
SUM(CHAR_LENGTH(Delay_Reason) - CHAR_LENGTH(REPLACE(Delay_Reason,'California','')))
/ CHAR_LENGTH('California')
FROM yourTable
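Applied to the values from the question, the same pattern might look like the sketch below. It assumes the table is called yourTable as in the queries above; the alias t is added only so that the column reference cannot be confused with the output column name. Note that REPLACE matches substrings, so this approach presumably over-counts if one reason's text is contained inside another's.

```sql
-- One branch per known reason; each branch divides the removed length
-- by the length of the search term to get an occurrence count.
SELECT 'ERIC_Drive Test Taking too Long' AS Delay_Reason,
       SUM(CHAR_LENGTH(t.Delay_Reason)
           - CHAR_LENGTH(REPLACE(t.Delay_Reason, 'ERIC_Drive Test Taking too Long', '')))
       / CHAR_LENGTH('ERIC_Drive Test Taking too Long') AS `count`
FROM yourTable AS t
UNION ALL
SELECT 'ERIC_Lack Of GSC Resources/Queuing DT Drives',
       SUM(CHAR_LENGTH(t.Delay_Reason)
           - CHAR_LENGTH(REPLACE(t.Delay_Reason, 'ERIC_Lack Of GSC Resources/Queuing DT Drives', '')))
       / CHAR_LENGTH('ERIC_Lack Of GSC Resources/Queuing DT Drives')
FROM yourTable AS t
UNION ALL
SELECT 'ERIC_Cluster Having Too Many RF Issues Needing Tuning',
       SUM(CHAR_LENGTH(t.Delay_Reason)
           - CHAR_LENGTH(REPLACE(t.Delay_Reason, 'ERIC_Cluster Having Too Many RF Issues Needing Tuning', '')))
       / CHAR_LENGTH('ERIC_Cluster Having Too Many RF Issues Needing Tuning')
FROM yourTable AS t;
```

For the three sample rows in the question this should give counts of 3, 2 and 1. MySQL reports them as decimals because `/` is decimal division.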
Answer 1 (score: 0)
SELECT Delay_Reason,COUNT(*)
FROM test
GROUP BY Delay_Reason;
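Answer 2 (score: 0)
The problem is that the database stores multiple values in a single field. The current schema violates first normal form, and therefore third normal form as well.
One solution is to run a query that normalizes your data, and then run an ordinary aggregate query on top of it. SUBSTRING_INDEX can be used for this.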
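As a quick illustration of how nested SUBSTRING_INDEX calls pick out the n-th element of a delimited string, consider the literal 'a;#b;#c;#' (a made-up value, not taken from the question):

```sql
-- Positive count: everything to the left of the 2nd delimiter.
SELECT SUBSTRING_INDEX('a;#b;#c;#', ';#', 2);                            -- 'a;#b'

-- Negative count: everything to the right of the last delimiter.
-- Nesting the two calls isolates the 2nd element.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('a;#b;#c;#', ';#', 2), ';#', -1); -- 'b'

-- Asking for an element past the end of a string that ends in ';#'
-- yields an empty string, which is why empty values are filtered out later.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('a;#', ';#', 2), ';#', -1);       -- ''
```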
CREATE TABLE yourTable (`Delay_Reason` varchar(512));
INSERT INTO yourTable (`Delay_Reason`)
VALUES
('ERIC_Drive Test Taking too Long;#ERIC_Lack Of GSC Resources/Queuing DT Drives;#ERIC_Cluster Having Too Many RF Issues Needing Tuning;#'),
('ERIC_Drive Test Taking too Long;#ERIC_Lack Of GSC Resources/Queuing DT Drives;#'),
('ERIC_Drive Test Taking too Long;#') ;
select delay_reason, count(*) count from (
/*
normalise the data
add as many substring_index union all elements as required
*/
select SUBSTRING_INDEX(delay_reason, ';#', 1) AS delay_reason from yourTable
union all
select SUBSTRING_INDEX(SUBSTRING_INDEX(delay_reason, ';#', 2), ';#', -1) AS delay_reason from yourTable
union all
select SUBSTRING_INDEX(SUBSTRING_INDEX(delay_reason, ';#', 3), ';#', -1) AS delay_reason from yourTable
union all
select SUBSTRING_INDEX(SUBSTRING_INDEX(delay_reason, ';#', 4), ';#', -1) AS delay_reason from yourTable
union all
select SUBSTRING_INDEX(SUBSTRING_INDEX(delay_reason, ';#', 5), ';#', -1) AS delay_reason from yourTable
union all
select SUBSTRING_INDEX(SUBSTRING_INDEX(delay_reason, ';#', 6), ';#', -1) AS delay_reason from yourTable
union all
select SUBSTRING_INDEX(SUBSTRING_INDEX(delay_reason, ';#', 7), ';#', -1) AS delay_reason from yourTable
union all
select SUBSTRING_INDEX(SUBSTRING_INDEX(delay_reason, ';#', 8), ';#', -1) AS delay_reason from yourTable
union all
select SUBSTRING_INDEX(SUBSTRING_INDEX(delay_reason, ';#', 9), ';#', -1) AS delay_reason from yourTable
union all
select SUBSTRING_INDEX(SUBSTRING_INDEX(delay_reason, ';#', 10), ';#', -1) AS delay_reason from yourTable
) delay_reasons
/* remove empty values */
where delay_reason <> ''
group by delay_reason
order by count desc;
Sample results:
delay_reason count
ERIC_Drive Test Taking too Long 3
ERIC_Lack Of GSC Resources/Queuing DT Drives 2
ERIC_Cluster Having Too Many RF Issues Needing Tuning 1
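The ten hand-written UNION ALL branches in the query above can also be generated instead of typed out. The following is only a sketch, assuming MySQL 8.0 or later (recursive CTEs are not available before that) and at most 10 reasons per row; the names positions, padded and reason_count are made up for illustration. It also pads every value so that rows without a trailing ';#' are split the same way as rows that have one, a case the nested SUBSTRING_INDEX approach would otherwise count repeatedly.

```sql
WITH RECURSIVE positions (n) AS (
    -- Generate the element positions 1..10 (assumed maximum per row).
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM positions WHERE n < 10
),
padded AS (
    -- Strip surrounding spaces and any trailing ';#', then add exactly one back.
    SELECT CONCAT(TRIM(TRAILING ';#' FROM TRIM(Delay_Reason)), ';#') AS reasons
    FROM yourTable
)
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(p.reasons, ';#', pos.n), ';#', -1)) AS delay_reason,
       COUNT(*) AS reason_count
FROM padded AS p
JOIN positions AS pos
  -- Only visit as many positions as the row has delimiters.
  ON pos.n <= (CHAR_LENGTH(p.reasons) - CHAR_LENGTH(REPLACE(p.reasons, ';#', '')))
              / CHAR_LENGTH(';#')
GROUP BY delay_reason
-- Drop empty elements, e.g. from rows that were empty to begin with.
HAVING delay_reason <> ''
ORDER BY reason_count DESC;
```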
If you are allowed to change the schema, you could also normalize the data itself. Search Google for "third normal form explained".
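As a rough sketch of what a normalized layout could look like: store one row per record and reason instead of a delimited list. The table name delay_reason_link and the record_id column are hypothetical, since the question does not show the rest of the schema.

```sql
-- Hypothetical normalized table: one row per (record, reason) pair.
CREATE TABLE delay_reason_link (
    record_id INT          NOT NULL,  -- assumed key of the original row
    reason    VARCHAR(128) NOT NULL,
    PRIMARY KEY (record_id, reason)
);

-- The three sample rows from the question, one reason per row.
INSERT INTO delay_reason_link (record_id, reason) VALUES
    (1, 'ERIC_Drive Test Taking too Long'),
    (1, 'ERIC_Lack Of GSC Resources/Queuing DT Drives'),
    (1, 'ERIC_Cluster Having Too Many RF Issues Needing Tuning'),
    (2, 'ERIC_Drive Test Taking too Long'),
    (2, 'ERIC_Lack Of GSC Resources/Queuing DT Drives'),
    (3, 'ERIC_Drive Test Taking too Long');

-- Counting becomes an ordinary aggregate with no string splitting.
SELECT reason, COUNT(*) AS reason_count
FROM delay_reason_link
GROUP BY reason
ORDER BY reason_count DESC;
```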