MYSQL SUBSTRING_INDEX提取列的每个不同字符串

时间:2016-01-03 15:14:52

标签: mysql split substring

我试图在MYSQL中的分隔符之间获取每个不同的字符串值。我尝试使用函数SUBSTRING_INDEX,它适用于第一个字符串和第一个字符串的延续,但不适用于第二个字符串。这就是我的意思:

Table x                    The result

enter image description here

SELECT SUBSTRING_INDEX(path, ':', 2) as p, sum(count) as N From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 3) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 4) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 5) as p, sum(count) From x Group by p UNION
SELECT SUBSTRING_INDEX(path, ':', 6) as p, sum(count) From x Group by p;

我尝试在查询中添加SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, ':', 2), ':', 2) as p, sum(count) From x Group by p UNION SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(path, ':', 4), ':', 2) as p, sum(count) From x Group by p,但结果仍然相同。我想要做的是不仅获得字符串A1,A2,A3组合的结果,而且还获得B2,C2,D2的字符串作为第一个字符串,如下表所示:

+---------------+----+
|   p           |  N |
+---------------+----+
| :A1           | 4  |
| ...           | ...|
| :B1           | 3  |
| :B1:C2        | 2  |
|...            | ...|
+---------------+----+

获得结果的正确功能是什么?感谢任何帮助,谢谢。

1 个答案:

答案 0 :(得分:1)

假设路径上的所有字符串节点都是两个字符长,并且所有路径都是相同的长度..

<强>计划

  
      
  • 使用每个块的固定长度2从路径的某些开始到结尾创建一系列有效的子串..
  •   
  • 加入到自身以获得不会走到路径尽头的路径
  •   
  • 使用上面计算的子字符串索引
  • 在x.path上获取子字符串   
  • 汇总上面的x.path子序列
  •   

<强>设置

create table x
(
  path varchar(23) primary key not null,
  count integer not null
);

insert into x
( path, count )
values
( ':A1:B2:C1:D1:G1' , 3 ),
( ':A1:B2:C1:D1:G4' , 1 ),
( ':A2:B1:C2:D2:G4' , 2 )
;

drop view if exists digits_v;
create view digits_v
as
select 0 as n
union all
select 1 union all select 2 union all select 3 union all 
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
;

<强>查询

select substring(x.path, `start`, `len`) as chunk, sum(x.count)
from x
cross join
(
  select o1.`start`, o2.`len`
  from
  (
    select 1 + 3 * seq.n as `start`, 15 - 3 * seq.n as `len`
    from digits_v seq
    where 1 + 3 * seq.n between 1 and 15
    and   15 - 3 * seq.n  between 1 and 15
  ) o1
  inner join
  (
    select 1 + 3 * seq.n as `start`, 15 - 3 * seq.n as `len`
    from digits_v seq
    where 1 + 3 * seq.n between 1 and 15
    and   15 - 3 * seq.n  between 1 and 15
  ) o2
  on  o2.`start` >= o1.`start` 
) splices
where substring(x.path, `start`, `len`) <> ''
group by substring(x.path, `start`, `len`)
order by length(substring(x.path, `start`, `len`)), substring(x.path, `start`, `len`)
;

<强>输出

+-----------------+--------------+
|      chunk      | sum(x.count) |
+-----------------+--------------+
| :A1             |            4 |
| :A2             |            3 |
| :A3             |            3 |
| ...             |          ... |
| :A1:B2          |            4 |
| :A2:B1          |            3 |
| :A3:B3          |            2 |
| :A3:B4          |            1 |
| ...             |          ... |
| :A1:B2:C1       |            4 |
| :A2:B1:C2       |            2 |
| :A2:B1:D2       |            3 |
| :A3:B3:C4       |            2 |
| :A3:B4:C2       |            1 |
| ...             |          ... |
| :A1:B2:C1:D1    |            4 |
| :A2:B1:C2:D2    |            2 |
| :A3:B3:C4:D3    |            2 |
| :A3:B4:C2:D3    |            1 |
| ...             |          ... |
| :A1:B2:C1:D1:G1 |            3 |
| :A1:B2:C1:D1:G4 |            1 |
| :A2:B1:C2:D2:G4 |            2 |
| :A3:B3:C4:D3:G7 |            2 |
| :A3:B4:C2:D3:G7 |            1 |
+-----------------+--------------+

<强> sqlfiddle