GROUP BY identical or similar strings in SQL

Date: 2016-12-14 07:11:06

Tags: sql-server string group-by equals

1) Suppose I have a table like this:

| id    |  color_code  |      fruit      |
|:------|--------------|----------------:|
| 1     |  000001      |      apple      |     
| 2     |  000001      |      apple      |    
| 3     |  000001      |      apple      |      
| 4     |  000002      |      lemon      |         
| 5     |  000002      |      lemon      |       
| 6     |  000003      |      grapes     |
| 7     |  000003      |      grapes     | 

How can I group the fruit column by the color_code column in SQL Server?

I want something like this:

| id    |  color_code  |      fruit      |   group_concat(id)  |
|:------|--------------|-----------------|---------------------|
| 1     |  000001      |      apple      |        1,2,3        |
| 4     |  000002      |      lemon      |        4,5          |
| 6     |  000003      |      grapes     |        6,7          |

2) What if I have three tables (shown below) that need to be joined? How can I achieve this?

series_no table:

|   id  |  desc_seriesno  |
|:------|----------------:|
| 7040  |     AU1011      |
| 7041  |     AU1022      |
| 7042  |     AU1033      |
| 7043  |     AU1044      |
| 7044  |     AU1055      |
| 7045  |     AU1066      |

brand table:

|   id  |  desc_brand     |
|:------|----------------:|
| 1020  |     Audi        |
| 1021  |     Bentley     |
| 1022  |     Ford        |
| 1023  |     BMW         |
| 1024  |     Mazda       |
| 1025  |     Toyota      |

car_info table:

|   seriesno_id  |  brand_id  |  color  |
|:---------------|------------|--------:|
|     7040       |    1020    | white   |
|     7040       |    1020    | black   |
|     7040       |    1020    | pink    |
|     7041       |    1021    | yellow  |
|     7041       |    1021    | brown   |
|     7042       |    1022    | purple  |
|     7042       |    1022    | black   |
|     7042       |    1022    | green   |
|     7043       |    1023    | blue    |
|     7044       |    1024    | red     |
|     7045       |    1025    | maroon  |
|     7045       |    1025    | white   |    

This is my current query on SQL Server 2014:

SELECT SN.id AS seriesid, B.id AS brandid, B.desc_brand
FROM [db1].[dbo].[series_no] SN
  LEFT JOIN [db1].[dbo].[car_info] CI
  ON CI.seriesno_id = SN.id
  RIGHT JOIN [db1].[dbo].[brand] B
  ON B.id = CI.brand_id
GROUP BY SN.id, B.id
ORDER BY SN.id ASC

Unfortunately it gives me an error, because I can't group by a string like this.

I want the result to look like this:

|  seriesid  |   brandid  |   desc_brand  | count |
|:-----------|------------|---------------|-------|
|    7040    |    1020    |     Audi      |   3   |
|    7041    |    1021    |     Bentley   |   2   |
|    7042    |    1022    |     Ford      |   3   |
|    7043    |    1023    |     BMW       |   1   |
|    7044    |    1024    |     Mazda     |   1   |
|    7045    |    1025    |     Toyota    |   2   |

3 answers:

Answer 0: (score: 0)

Use the following code.

SELECT distinct
       m.color_code
     , m.fruit      
     , group_concat = STUFF((
          SELECT ',' + CONVERT(varchar(10),md.id) 
          FROM dbo.tablename md
          WHERE m.fruit = md.fruit and m.color_code = md.color_code 
          FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM dbo.tablename  m

For the second question:

SELECT SN.id AS seriesid, B.id AS brandid, B.desc_brand, COUNT(*) AS [count]
FROM [db1].[dbo].[series_no] SN
  LEFT JOIN [db1].[dbo].[car_info] CI
  ON CI.seriesno_id = SN.id
  RIGHT JOIN [db1].[dbo].[brand] B
  ON B.id = CI.brand_id
GROUP BY SN.id, B.id ,B.desc_brand
ORDER BY 4 ASC

Answer 1: (score: 0)

1. Fruit colors

Assuming the table is named FruitColor, you can get the desired output with the following query:

SELECT MIN(id) AS id
    , color_code
    , fruit
    , group_concat_id = STUFF((SELECT ',' + CAST(id AS VARCHAR)
                               FROM FruitColor AS fci
                               WHERE fci.fruit = fc.fruit AND fci.color_code = fc.color_code
                               FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM FruitColor AS fc
GROUP BY color_code, fruit
ORDER BY id;
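
For readers on a newer server, the same grouping can be sketched with STRING_AGG. This is only an alternative under stated assumptions: STRING_AGG exists from SQL Server 2017 onward, so it does not apply to the SQL Server 2014 instance in the question, and the table variable @FruitColor below is a hypothetical stand-in for the real table, filled with the question's sample rows.

```sql
-- Hypothetical stand-in for the fruit table from the question.
DECLARE @FruitColor TABLE (id INT, color_code CHAR(6), fruit VARCHAR(20));

INSERT INTO @FruitColor (id, color_code, fruit) VALUES
    (1, '000001', 'apple'),
    (2, '000001', 'apple'),
    (3, '000001', 'apple'),
    (4, '000002', 'lemon'),
    (5, '000002', 'lemon'),
    (6, '000003', 'grapes'),
    (7, '000003', 'grapes');

-- Requires SQL Server 2017 or later.
-- WITHIN GROUP fixes the order of the ids inside each concatenated list.
SELECT MIN(id) AS id
    , color_code
    , fruit
    , STRING_AGG(CAST(id AS VARCHAR(10)), ',') WITHIN GROUP (ORDER BY id) AS group_concat_id
FROM @FruitColor
GROUP BY color_code, fruit
ORDER BY MIN(id);
```

On the sample data this yields the lists 1,2,3 for apple, 4,5 for lemon and 6,7 for grapes. On SQL Server 2014 the STUFF / FOR XML PATH form above remains the way to do it.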

2. You have a few options:

Option (a): show the count per series and brand

SELECT seriesno_id AS seriesid, ci.brand_id AS brandid, desc_brand, COUNT(*) AS [count]
FROM db1.dbo.car_info AS ci
LEFT JOIN db1.dbo.brand AS b ON (b.id = ci.brand_id)
GROUP BY seriesno_id, ci.brand_id, desc_brand;
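  • If you only want the number of cars that have a brand, you do not need the series table at all.
  • You probably do not need a RIGHT JOIN on the brand table: if the brand table contains records that are not in car_info, seriesno_id will be NULL for those rows.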
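
The second point can be seen in a small sketch. The table variables and the brand 1026 / Tesla are made up for illustration; they are not part of the question's data.

```sql
-- Hypothetical stand-ins for the brand and car_info tables.
DECLARE @brand TABLE (id INT, desc_brand NVARCHAR(200));
DECLARE @car_info TABLE (seriesno_id INT, brand_id INT, color VARCHAR(20));

INSERT INTO @brand (id, desc_brand) VALUES
    (1020, N'Audi'),
    (1026, N'Tesla');   -- made-up brand with no rows in car_info

INSERT INTO @car_info (seriesno_id, brand_id, color) VALUES
    (7040, 1020, 'white'),
    (7040, 1020, 'black');

-- RIGHT JOIN keeps every brand, including the one without cars.
SELECT ci.seriesno_id AS seriesid, b.id AS brandid, b.desc_brand, COUNT(*) AS [count]
FROM @car_info AS ci
RIGHT JOIN @brand AS b ON b.id = ci.brand_id
GROUP BY ci.seriesno_id, b.id, b.desc_brand;
```

The Tesla row comes back with seriesid NULL and a count of 1, because COUNT(*) counts the row that the outer join preserved. That row is usually not wanted in a per-series report, which is why the plain LEFT JOIN from car_info is enough here.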

Option (b): show counts for all series, whether or not they have a brand

SELECT sn.id AS seriesid, ci.brand_id AS brandid, desc_brand, COUNT(*) AS [count]
FROM db1.dbo.series_no AS sn
LEFT JOIN db1.dbo.car_info AS ci ON (ci.seriesno_id = sn.id)
LEFT JOIN db1.dbo.brand AS b ON (b.id = ci.brand_id)
GROUP BY sn.id, ci.brand_id, desc_brand;
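
One caveat with option (b) is worth a sketch: COUNT(*) counts rows, so a series with no matching car_info row still reports 1. Assuming you would rather see 0 for such a series, counting a column that comes from car_info is one way to get it. The table variables and the series 7046 / AU1077 are hypothetical.

```sql
-- Hypothetical stand-ins for the series_no and car_info tables.
DECLARE @series_no TABLE (id INT, desc_seriesno VARCHAR(20));
DECLARE @car_info TABLE (seriesno_id INT, brand_id INT, color VARCHAR(20));

INSERT INTO @series_no (id, desc_seriesno) VALUES
    (7040, 'AU1011'),
    (7046, 'AU1077');   -- made-up series with no cars

INSERT INTO @car_info (seriesno_id, brand_id, color) VALUES
    (7040, 1020, 'white'),
    (7040, 1020, 'black'),
    (7040, 1020, 'pink');

SELECT sn.id AS seriesid
    , COUNT(*) AS row_count               -- counts the NULL-extended row too
    , COUNT(ci.seriesno_id) AS car_count  -- skips NULLs, so empty series get 0
FROM @series_no AS sn
LEFT JOIN @car_info AS ci ON ci.seriesno_id = sn.id
GROUP BY sn.id;
```

For series 7040 both counts are 3; for the empty series 7046, row_count is 1 while car_count is 0.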

Option (c): work around selecting a column that is not in the GROUP BY

SELECT seriesno_id AS seriesid, ci.brand_id AS brandid, MAX(desc_brand) AS desc_brand, COUNT(*) AS [count]
FROM db1.dbo.car_info AS ci
LEFT JOIN db1.dbo.brand AS b ON (b.id = ci.brand_id)
GROUP BY seriesno_id, ci.brand_id;
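  • Here, if we are sure that each brand has exactly one desc_brand, we can apply an aggregate to it, because an aggregate over a single distinct value returns that value. I used MAX here.

Personally I would go with option (a), since it makes the most sense.

Update: GROUP BY exception when desc_brand is NTEXT ...

Cast desc_brand to NVARCHAR to avoid the exception.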

 CAST(desc_brand AS NVARCHAR(200))
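
As a sketch of where the cast goes, here is option (a) with it applied. It assumes desc_brand is stored as NTEXT and that 200 characters is enough for every brand name (longer values would be truncated); the table names are the ones from the question.

```sql
-- Assumes db1.dbo.brand.desc_brand is NTEXT, which cannot be grouped directly.
-- The same CAST expression must appear in both the SELECT list and the GROUP BY.
SELECT ci.seriesno_id AS seriesid
    , ci.brand_id AS brandid
    , CAST(b.desc_brand AS NVARCHAR(200)) AS desc_brand
    , COUNT(*) AS [count]
FROM db1.dbo.car_info AS ci
LEFT JOIN db1.dbo.brand AS b ON (b.id = ci.brand_id)
GROUP BY ci.seriesno_id
    , ci.brand_id
    , CAST(b.desc_brand AS NVARCHAR(200))
ORDER BY ci.seriesno_id;
```

If changing the schema is an option, altering the column itself to NVARCHAR would remove the need for the cast in every query.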

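Also, I strongly recommend using VARCHAR / NVARCHAR rather than TEXT / NTEXT or CHAR: TEXT and NTEXT are deprecated and cannot be used in GROUP BY, and CHAR pads every value to its declared length, so it usually takes more space.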

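Answer 2: (score: 0)

SELECT 
-- first id in the list, i.e. everything up to the first comma
id = LEFT(group_concat, CHARINDEX(',', group_concat + ',') - 1),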
color_code,
fruit,
group_concat
FROM(
SELECT distinct
   m.color_code, 
   m.fruit,       
   group_concat = STUFF((SELECT ',' + CONVERT(varchar(10),md.id) 
                         FROM [Test_1].[dbo].[Stuff] md
                         WHERE m.fruit = md.fruit 
                            AND m.color_code = md.color_code 
                         FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
FROM [Test_1].[dbo].[Stuff] m)x