Use regexp to separate characters from digits in a column

Date: 2015-07-08 15:52:18

Tags: mysql regex group-concat

I'm trying to figure out how to combine GROUP_CONCAT with a regexp. Here is an example of what I want to do (keep in mind that I have thousands of rows, so I can't do it one by one):

first_name  
  Maria05aa      
  John89bcb       
  George07
  Angie53cs

My results would look like this:

column_a        column_b     column_c
Maria.05.aa     Maria.aa       05
John.89.bcb     John.bcb       89
George.07       George         07
Angie.53.cs     Angie.cs       53

How can I achieve these results?

1 answer:

Answer 0 (score: 0)

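I would start from the answer to this question, which finds the position of the first digit in a string. You can adapt the same query to find the position of the last digit in the string. So, first, I computed those two values: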

SELECT first_name,
  LEAST(
    if (LOCATE('0', first_name) > 0, LOCATE('0', first_name), 101),
    if (LOCATE('1', first_name) > 0, LOCATE('1', first_name), 101),
    if (LOCATE('2', first_name) > 0, LOCATE('2', first_name), 101),
    if (LOCATE('3', first_name) > 0, LOCATE('3', first_name), 101),
    if (LOCATE('4', first_name) > 0, LOCATE('4', first_name), 101),
    if (LOCATE('5', first_name) > 0, LOCATE('5', first_name), 101),
    if (LOCATE('6', first_name) > 0, LOCATE('6', first_name), 101),
    if (LOCATE('7', first_name) > 0, LOCATE('7', first_name), 101),
    if (LOCATE('8', first_name) > 0, LOCATE('8', first_name), 101),
    if (LOCATE('9', first_name) > 0, LOCATE('9', first_name), 101)
  ) AS firstNumberIndex,
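  -- Position of the last digit. LOCATE only returns the first occurrence of a digit,
  -- which is wrong when a digit repeats (e.g. 'John99'), so measure the text that
  -- follows the last occurrence instead. Each term is 0 when the digit is absent.
  GREATEST(
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '0', -1)),
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '1', -1)),
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '2', -1)),
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '3', -1)),
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '4', -1)),
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '5', -1)),
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '6', -1)),
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '7', -1)),
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '8', -1)),
    CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '9', -1))
  ) AS lastNumberIndex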
FROM myTable;

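I arbitrarily chose 101 as the "not found" value because I declared the string column with a length of 100, so 101 is an index that can never occur.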
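
To see why that sentinel is needed, here is a small check on the literal 'Maria05aa'. It is only a sketch: it looks at two of the ten digits to stay short. LOCATE returns 0 when the digit is absent, and without the IF guard that 0 would win the LEAST comparison.

```sql
-- Sketch on a literal value; only the digits '0' and '9' are checked to keep it short.
SELECT
  LOCATE('0', 'Maria05aa') AS pos_of_0,          -- 6
  LOCATE('9', 'Maria05aa') AS pos_of_9,          -- 0, because '9' does not occur
  LEAST(LOCATE('0', 'Maria05aa'),
        LOCATE('9', 'Maria05aa')) AS unguarded,  -- 0, which is not a valid position
  LEAST(IF(LOCATE('0', 'Maria05aa') > 0, LOCATE('0', 'Maria05aa'), 101),
        IF(LOCATE('9', 'Maria05aa') > 0, LOCATE('9', 'Maria05aa'), 101)) AS guarded;  -- 6
```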

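With those in hand, I used that query as a subquery to take the substring to the left of the first digit, the substring to the right of the last digit, and the substring between them, which gives the pieces you want: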

SELECT SUBSTRING(first_name, 1, firstNumberIndex - 1) AS firstPiece,
   SUBSTRING(first_name, firstNumberIndex, (lastNumberIndex - firstNumberIndex + 1)) AS numbers, 
   SUBSTRING(first_name, lastNumberIndex + 1) AS lastPiece
FROM(
    mySubquery) tmp;
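
As a quick illustration of the three SUBSTRING calls, the sketch below plugs in the indexes for 'Maria05aa' by hand (first digit at position 6, last digit at position 7) instead of reading them from the subquery.

```sql
-- Sketch with hand-computed indexes for 'Maria05aa': firstNumberIndex = 6, lastNumberIndex = 7.
SELECT
  SUBSTRING('Maria05aa', 1, 6 - 1)     AS firstPiece,  -- 'Maria'
  SUBSTRING('Maria05aa', 6, 7 - 6 + 1) AS numbers,     -- '05'
  SUBSTRING('Maria05aa', 7 + 1)        AS lastPiece;   -- 'aa'
```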

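After that, the only thing left is to put the pieces into the format you want. Again, I used a subquery to keep it readable, although a subquery isn't strictly necessary because all of the data comes from the same table. Still, since parts of this are complicated, I think the extra nesting is a worthwhile trade-off for readability. I noticed that you don't want a period added before an empty substring, so I had to write a few CASE expressions: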

SELECT
  CASE 
    WHEN numbers = '' THEN firstPiece
    ELSE
      CASE 
        WHEN lastPiece = '' THEN CONCAT(firstPiece, '.', numbers)
        ELSE CONCAT(firstPiece, '.', numbers, '.', lastPiece)
      END
  END AS column_a,
  CASE 
    WHEN lastPiece = '' THEN firstPiece
    ELSE CONCAT(firstPiece, '.', lastPiece)
  END AS column_b,
  numbers AS column_c
FROM(
   myHugeSubquery) tmp;

Here is an SQL Fiddle example.
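
For reference, the three steps can be nested into a single statement roughly as follows. This is a sketch rather than the exact query from the fiddle: it assumes the table is myTable with a first_name column of at most 100 characters, and the derived-table aliases idx and pieces are made up for illustration. The last-digit position is computed with SUBSTRING_INDEX, which also handles names where a digit repeats (LOCATE alone returns only the first occurrence). The guard on lastPiece is an addition for names with no digits at all; without it, SUBSTRING(first_name, 0 + 1) would return the whole name.

```sql
SELECT
  CASE
    WHEN numbers = ''   THEN firstPiece
    WHEN lastPiece = '' THEN CONCAT(firstPiece, '.', numbers)
    ELSE CONCAT(firstPiece, '.', numbers, '.', lastPiece)
  END AS column_a,
  CASE
    WHEN lastPiece = '' THEN firstPiece
    ELSE CONCAT(firstPiece, '.', lastPiece)
  END AS column_b,
  numbers AS column_c
FROM (
  -- Step 2: cut the name into the three pieces.
  SELECT
    SUBSTRING(first_name, 1, firstNumberIndex - 1) AS firstPiece,
    SUBSTRING(first_name, firstNumberIndex,
              lastNumberIndex - firstNumberIndex + 1) AS numbers,
    -- Added guard: lastNumberIndex is 0 when the name contains no digit.
    CASE
      WHEN lastNumberIndex = 0 THEN ''
      ELSE SUBSTRING(first_name, lastNumberIndex + 1)
    END AS lastPiece
  FROM (
    -- Step 1: positions of the first and last digit (101 and 0 mean "no digit").
    SELECT first_name,
      LEAST(
        IF(LOCATE('0', first_name) > 0, LOCATE('0', first_name), 101),
        IF(LOCATE('1', first_name) > 0, LOCATE('1', first_name), 101),
        IF(LOCATE('2', first_name) > 0, LOCATE('2', first_name), 101),
        IF(LOCATE('3', first_name) > 0, LOCATE('3', first_name), 101),
        IF(LOCATE('4', first_name) > 0, LOCATE('4', first_name), 101),
        IF(LOCATE('5', first_name) > 0, LOCATE('5', first_name), 101),
        IF(LOCATE('6', first_name) > 0, LOCATE('6', first_name), 101),
        IF(LOCATE('7', first_name) > 0, LOCATE('7', first_name), 101),
        IF(LOCATE('8', first_name) > 0, LOCATE('8', first_name), 101),
        IF(LOCATE('9', first_name) > 0, LOCATE('9', first_name), 101)
      ) AS firstNumberIndex,
      GREATEST(
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '0', -1)),
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '1', -1)),
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '2', -1)),
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '3', -1)),
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '4', -1)),
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '5', -1)),
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '6', -1)),
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '7', -1)),
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '8', -1)),
        CHAR_LENGTH(first_name) - CHAR_LENGTH(SUBSTRING_INDEX(first_name, '9', -1))
      ) AS lastNumberIndex
    FROM myTable
  ) idx
) pieces;
```

Note that anything between the first and last digit ends up in column_c, so a value with letters between digits would keep those letters in the "numbers" piece.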