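Counting the number of unique elements across a whole database

Time: 2012-05-24 14:09:51

Tags: mysql

I have the following database of users, where each user can speak different languages at different levels.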

id      langs
12      EN-21
36      EN-2,RU-3
41      EN-9
57      DE-35,EN-28
60      DE-9,RU-14
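
To reproduce the problem locally, the sample data could be loaded as in the sketch below. The table name `people` comes from the queries later in this thread; the column types are assumptions, since the question does not state them.

```sql
-- Assumed schema: the question only shows the column names id and langs.
CREATE TABLE people (
  id    INT PRIMARY KEY,
  langs VARCHAR(255) NOT NULL  -- comma-separated "LANG-level" pairs
);

-- The five sample rows from the question.
INSERT INTO people (id, langs) VALUES
  (12, 'EN-21'),
  (36, 'EN-2,RU-3'),
  (41, 'EN-9'),
  (57, 'DE-35,EN-28'),
  (60, 'DE-9,RU-14');
```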

I want to write a MySQL query that counts how many times each language occurs, regardless of its level. The desired table should look like this:

lang    count
EN      4
DE      2
RU      2

I have tried different combinations, but the result is far from perfect.

SELECT 
    DISTINCT SUBSTRING_INDEX(langs, '-', 1) AS lang
--  COUNT(langs) as count
--  SUM(
--      (SELECT DISTINCT SUBSTRING_INDEX(langs, '-', 1) 
--      FROM people
--      WHERE langs != '')
--  )
FROM people
WHERE langs != ''
--  GROUP BY lang
ORDER BY lang

2 answers:

Answer 0: (score: 2)

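If there is an upper limit on the number of languages in a set, you can pull out all the first elements, the second elements, the third elements, and so on, and union them together. Here is an example that extracts the first or second element of each language set and combines them: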

select distinct substring_index(langs, '-', 1) as lang
from people where langs != ''
union
select distinct SUBSTRING_INDEX(SUBSTRING_INDEX(langs, '-', 2), ',', -1)
from people where LENGTH(langs) - LENGTH(REPLACE(langs,',','')) + 1 > 1

Demo: http://www.sqlfiddle.com/#!2/b86f2/1
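
The same pattern extends to a third element by adding one more branch. The sketch below assumes that every element has the form `LANG-level` with exactly one hyphen, and that elements are separated by a bare comma with no spaces; the third branch is an illustration and is not part of the original answer.

```sql
-- First element of every non-empty set.
SELECT DISTINCT SUBSTRING_INDEX(langs, '-', 1) AS lang
FROM people
WHERE langs != ''
UNION
-- Second element, only for rows that list at least two languages.
SELECT DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(langs, '-', 2), ',', -1)
FROM people
WHERE LENGTH(langs) - LENGTH(REPLACE(langs, ',', '')) + 1 > 1
UNION
-- Third element (assumed extension), only for rows with at least three.
-- Cutting at the third hyphen leaves the third language code as the
-- last comma-separated piece.
SELECT DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(langs, '-', 3), ',', -1)
FROM people
WHERE LENGTH(langs) - LENGTH(REPLACE(langs, ',', '')) + 1 > 2;
```

On the sample data no row lists three languages, so the third branch contributes nothing and the distinct languages remain EN, DE and RU.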


From there, you can join the list of languages against the list of people and count the matches, using a comparison such as people.langs like '%EN%':

select
  lang,
  count(case when people.langs like concat('%',langs.lang,'%') then 1 end) as count
from people,
  (
    select distinct substring_index(langs, '-', 1) as lang
    from people where langs != ''
    union
    select distinct SUBSTRING_INDEX(SUBSTRING_INDEX(langs, '-', 2), ',', -1)
    from people where LENGTH(langs) - LENGTH(REPLACE(langs,',','')) + 1 > 1
  ) langs
group by langs.lang
order by langs.lang

Sample output:

LANG    COUNT
====    ====
DE      2
EN      4
RU      2

Demo: http://www.sqlfiddle.com/#!2/b86f2/5
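
One caveat with a bare `'%EN%'` pattern is that it matches the letters anywhere in the string, so a code that happens to contain another code would be counted as well. A possible tightening, sketched here under the assumption that elements are separated by a bare comma and each code is followed by a hyphen, is to anchor the pattern on those delimiters. The aliases `p` and `l` are illustrative only.

```sql
SELECT
  l.lang,
  -- Prefix the set with a comma so every code, including the first,
  -- is preceded by ',' and followed by '-'.
  COUNT(CASE WHEN CONCAT(',', p.langs) LIKE CONCAT('%,', l.lang, '-%')
             THEN 1 END) AS count
FROM people AS p
CROSS JOIN (
    SELECT DISTINCT SUBSTRING_INDEX(langs, '-', 1) AS lang
    FROM people WHERE langs != ''
    UNION
    SELECT DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(langs, '-', 2), ',', -1)
    FROM people WHERE LENGTH(langs) - LENGTH(REPLACE(langs, ',', '')) + 1 > 1
) AS l
GROUP BY l.lang
ORDER BY l.lang;
```

On the sample data this gives the same counts as the query above (DE 2, EN 4, RU 2); the difference only shows up when one language code is a substring of another.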

Answer 1: (score: 0)

SELECT SUBSTRING_INDEX(langs, '-', 1) AS lang, count(1) as count_lang
FROM people
WHERE langs!=''
GROUP BY lang
ORDER BY lang

Please give it a try and let me know what you get.
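
One thing to check when trying this: `SUBSTRING_INDEX(langs, '-', 1)` returns only the first language code of each row, so on the sample data the query counts EN 3 and DE 2 and produces no RU row. A variant that also counts later positions is sketched below. It joins each row against a small inline list of positions; the list of 1 to 4 is an assumed upper bound on languages per user and would need to be extended for longer sets.

```sql
SELECT
  -- Take the n-th comma-separated element, then keep the part before '-'.
  SUBSTRING_INDEX(
    SUBSTRING_INDEX(SUBSTRING_INDEX(p.langs, ',', n.n), ',', -1),
    '-', 1
  ) AS lang,
  COUNT(*) AS count_lang
FROM people AS p
JOIN (
    -- Assumed maximum of four languages per user.
    SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
) AS n
  -- Only use positions that exist in this row: element count = commas + 1.
  ON n.n <= LENGTH(p.langs) - LENGTH(REPLACE(p.langs, ',', '')) + 1
WHERE p.langs != ''
GROUP BY lang
ORDER BY lang;
```

On the sample data this yields DE 2, EN 4 and RU 2, which matches the table requested in the question.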