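How to match comma-separated strings regardless of the order of their values in MySQL

Asked: 2014-09-17 10:07:22

Tags: mysql

Question:

I want to test whether one set of values equals another set, even when the values do not appear in the same order. For example, 'a,b,c,d' must be treated as equal to 'b,c,a,d'.

What I have tried:

I tried the IN clause and also checked with FIND_IN_SET.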

SELECT 'a,b,c,d' IN 'b,c,a,d';

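Neither of them does the job.

If anyone can help, it would be much appreciated.

Thanks, Sandeep

3 answers:

Answer 0 (score: 3)

FIND_IN_SET should do the trick, but its first argument must be a single value; it does not work properly if that argument contains a comma. You have to look up each value individually: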

SELECT 
  FIND_IN_SET('a', 'b,c,a,d') AND
  FIND_IN_SET('b', 'b,c,a,d') AND
  FIND_IN_SET('c', 'b,c,a,d') AND
  FIND_IN_SET('d', 'b,c,a,d')
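
To see why each value has to be looked up on its own, it may help to look at what FIND_IN_SET returns. The following is a minimal sketch using literal values only; the column aliases are made up for illustration. Note that a chain of AND conditions like the one above only shows that every value of the first list occurs in the second one, so a shorter first list such as 'a,b' would also pass; a full equality test additionally has to compare the number of elements.

```sql
SELECT
  FIND_IN_SET('a', 'b,c,a,d') AS pos_a,  -- 3: 'a' is the third element
  FIND_IN_SET('x', 'b,c,a,d') AS pos_x,  -- 0: 'x' is not in the list
  -- Both values of the shorter list 'a,b' are found, although the sets differ
  FIND_IN_SET('a', 'b,c,a,d') AND FIND_IN_SET('b', 'b,c,a,d') AS all_found;  -- 1
```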

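If you don't have these individual values available, you can split the input value into multiple values. The answers to the question "Split values to multiple rows" may give you some inspiration.

A better solution is not to store comma-separated values at all. This is considered bad practice.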
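
As a rough sketch of what the normalized alternative could look like: each member of a set is stored as its own row, and two sets are equal when they have the same number of members and every member of one also occurs in the other. The table and column names (set_member, set_id, item) are hypothetical and not part of the question.

```sql
-- Hypothetical schema: one row per (set, member) pair
CREATE TABLE set_member (
  set_id INT         NOT NULL,
  item   VARCHAR(50) NOT NULL,
  PRIMARY KEY (set_id, item)  -- also rules out duplicate members
);

INSERT INTO set_member (set_id, item) VALUES
  (1, 'a'), (1, 'b'), (1, 'c'), (1, 'd'),
  (2, 'b'), (2, 'c'), (2, 'a'), (2, 'd'),
  (3, 'a'), (3, 'c'), (3, 'd');

-- Find all sets that are equal to set 1
SELECT other.set_id
FROM set_member AS other
LEFT JOIN set_member AS ref
       ON ref.set_id = 1 AND ref.item = other.item
WHERE other.set_id <> 1
GROUP BY other.set_id
HAVING COUNT(*) = COUNT(ref.item)  -- every member also occurs in set 1
   AND COUNT(*) = (SELECT COUNT(*) FROM set_member WHERE set_id = 1);  -- same size
```

With the sample rows above, only set 2 qualifies; set 3 is a proper subset of set 1 and is rejected by the size comparison.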

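Answer 1 (score: 3)

This demonstrates the "split values to multiple rows" technique mentioned by GolezTrol, combined with FIND_IN_SET and wrapped in a function that can be used in the following ways: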

SELECT are_sets_equal(col_with_set, 'a,b,d,c') FROM example;

SELECT * FROM example
WHERE are_sets_equal(col_with_set, 'a,b,d,c')

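The idea is as follows:

  • Split the first set into a temporary table.
  • Count how many of these values are found in the second set.
  • If this count equals the number of elements in both sets, the sets are equal.
  • The function returns 1 if the two sets are equal and 0 if they differ, as required.

Both sets are limited to 1000 values, but this can easily be extended: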

DELIMITER //
CREATE FUNCTION are_sets_equal(set_a VARCHAR(2000), set_b VARCHAR(2000)) RETURNS BOOLEAN
BEGIN
  DECLARE is_equal BOOLEAN;
  DECLARE count_a INT;
  DECLARE count_b INT;

  -- calculate the count of elements in both sets 
  SET count_a = 1 + LENGTH(set_a) - LENGTH(REPLACE(set_a, ',', ''));
  SET count_b = 1 + LENGTH(set_b) - LENGTH(REPLACE(set_b, ',', ''));

  SELECT
    -- if all elements of the first set are contained in the second
    -- set and both sets have the same number of elements then both
    -- sets are considered equal
    COUNT(t.value) = count_a AND count_a = count_b INTO is_equal
    FROM (
      SELECT
        SUBSTRING_INDEX(SUBSTRING_INDEX(e.col, ',', n.n), ',', -1) value
      FROM ( SELECT set_a AS col ) e
      CROSS JOIN(
        -- build for up to 1000 separated values
        SELECT 
            a.N + b.N * 10 + c.N * 100 + 1 AS n
        FROM
            (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
           ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
           ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
        ORDER BY n
    ) n
      WHERE n.n <= count_a
    ) t
    WHERE FIND_IN_SET(t.value, set_b);

    return is_equal;
END //
DELIMITER ;
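
A quick way to check the behaviour is to call the function with literal lists. This sketch assumes are_sets_equal has been created exactly as shown above; the column aliases are made up for illustration.

```sql
SELECT
  are_sets_equal('a,b,c,d', 'b,c,a,d') AS same_members,  -- 1: same values, different order
  are_sets_equal('a,b,c,d', 'a,c,d')   AS missing_one;   -- 0: element counts differ
```

Because the function compares counts rather than distinct values, lists with repeated values can produce a false positive: 'a,a,b' and 'a,b,c' both have three elements and every element of the first is found in the second, so they would be reported as equal.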

Explanation

Building the numbers table

SELECT 
    a.N + b.N * 10 + c.N * 100 + 1 AS n
FROM
    (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
ORDER BY n

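This builds a numbers table with the values 1 to 1000. How to extend it to a larger range should be obvious.

Note that such a numbers table may already exist in your database, in which case there is no need to build it on the fly.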
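
If no such table exists yet, one possible way to create it once and reuse it is sketched below. The table names digits and numbers are assumptions chosen for this example; the arithmetic is the same as in the derived table above.

```sql
-- Helper table holding the ten decimal digits
CREATE TABLE digits (d INT NOT NULL PRIMARY KEY);
INSERT INTO digits (d) VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

-- Permanent numbers table with the values 1 to 1000
CREATE TABLE numbers (n INT NOT NULL PRIMARY KEY);
INSERT INTO numbers (n)
SELECT a.d + b.d * 10 + c.d * 100 + 1
FROM digits a
CROSS JOIN digits b
CROSS JOIN digits c;
```

The function could then join against numbers directly instead of rebuilding the sequence on every call.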

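Splitting a set into a table

With the help of this numbers table we can split the list of values into a table, using nested SUBSTRING_INDEX calls to cut one value after another out of the list, as described in "SQL split values to multiple rows":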

SELECT
    SUBSTRING_INDEX(SUBSTRING_INDEX(t.col, ',', n.n), ',', -1) value
FROM (SELECT @set_a as col ) t CROSS JOIN (
    -- build for up to 1000 separated values
    SELECT 
        a.N + b.N * 10 + c.N * 100 + 1 AS n
    FROM
        (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
       ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
       ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) c
    ORDER BY n
) n
WHERE 
    n <= 1 + LENGTH(@set_a) - LENGTH(REPLACE(@set_a, ',', ''))
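
The nested SUBSTRING_INDEX calls are easier to follow for a single fixed position. In this small sketch the literal 3 stands in for n.n: the inner call keeps everything up to the third separator, and the outer call with a count of -1 keeps only the last element of that prefix.

```sql
SELECT
  SUBSTRING_INDEX('a,b,c,d', ',', 3) AS first_three,                       -- 'a,b,c'
  SUBSTRING_INDEX(SUBSTRING_INDEX('a,b,c,d', ',', 3), ',', -1) AS third;   -- 'c'
```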

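Counting the elements of a set

We get the number of elements in the list from the expression in the WHERE clause: there is one more value than there are occurrences of the separator.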
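
As a worked example of that expression, using the same user variable name as the query above: the list 'a,b,c,d' is 7 characters long and 4 characters long once the commas are removed, so it contains 3 separators and therefore 4 elements.

```sql
SET @set_a = 'a,b,c,d';

SELECT
  LENGTH(@set_a)                                              AS total_length,   -- 7
  LENGTH(REPLACE(@set_a, ',', ''))                            AS without_commas, -- 4
  1 + LENGTH(@set_a) - LENGTH(REPLACE(@set_a, ',', ''))       AS element_count;  -- 4
```

One consequence of the formula is that an empty string is counted as one element, so empty lists may need separate handling.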

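Then we restrict the result by looking up each value in the second set with FIND_IN_SET.

As the last step, we compare the count of values in the result with the count of values in both sets and return the outcome.

Demo

Try this demo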

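Answer 2 (score: 0)

You can use a UDF (user-defined function) to compare your sets. Since, according to the comments, the sets will not contain duplicate values, I slightly customized the UDF provided by @Simon at mso.net. I compute the number of values in the second comma-separated list and, at the end, compare it with the number of FIND_IN_SET matches stored in the numReturn variable; if the two are equal the function returns 1, otherwise it returns 0 for a non-match. Note that this will not work for duplicate/repeated values in the sets.
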
DELIMITER $$
DROP FUNCTION IF EXISTS `countMatchingElements`$$
CREATE DEFINER = `root` @`localhost` FUNCTION `countMatchingElements` (
  inFirstList VARCHAR (1000),
  inSecondList VARCHAR (1000)
) RETURNS TINYINT (3) UNSIGNED NO SQL DETERMINISTIC SQL SECURITY INVOKER 
BEGIN
  DECLARE numReturn TINYINT UNSIGNED DEFAULT 0 ;
  DECLARE idsInFirstList TINYINT UNSIGNED ;
  DECLARE currentListItem VARCHAR (255) DEFAULT '' ;
  DECLARE currentID TINYINT UNSIGNED ;
  DECLARE total_values_in_second INT DEFAULT 0 ;
  SET total_values_in_second = ROUND(
    (
      LENGTH(inSecondList) - LENGTH(REPLACE (inSecondList, ',', ''))
    ) / LENGTH(',')
  ) + 1 ;
  SET idsInFirstList = (CHAR_LENGTH(inFirstList) + 1) - CHAR_LENGTH(REPLACE(inFirstList, ',', '')) ;
  SET currentID = 1 ;
  -- Loop over inFirstList, and for each element that is in inSecondList increment numReturn
  firstListLoop :
  REPEAT
    SET currentListItem = SUBSTRING_INDEX(
      SUBSTRING_INDEX(inFirstList, ',', currentID),
      ',',
      - 1
    ) ;
    IF FIND_IN_SET(currentListItem, inSecondList) 
    THEN SET numReturn = numReturn + 1 ;
    END IF ;
    SET currentID = currentID + 1 ;
    UNTIL currentID > idsInFirstList 
  END REPEAT firstListLoop ;
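  -- Equal only if every element of inFirstList matched and both lists have the same size;
  -- without the second condition 'a,b,c' would be reported as equal to 'a,b'
  IF total_values_in_second = numReturn AND idsInFirstList = numReturn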
  THEN RETURN 1 ;
  ELSE RETURN 0 ;
  END IF ;
END $$

DELIMITER ;

Fiddle Demo