How do I group by each individual word in a column of phrases and sum the values?

Time: 2017-04-27 10:28:17

Tags: mysql sql

Suppose I have a table containing data like this:

keyword          times_phrase_searched
-------          ---------------------
open windows     1000
closed windows   750
open doors       350
closed doors     250
nice window      100
nice windows     50
ugly doors       25

What SQL query do I need to group by each word individually and sum the search counts of the phrases in which that word appears? For the sample data above, the expected result would be:

word             times_word_in_searches
----             ----------------------
windows          1800
open             1350
closed           1000
doors            625
nice             150
window           100
ugly             25
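
For context, a table matching the sample data could be created like this; the column names are taken from the sample output, and the table name words is only an assumption (Answer 0 below queries a table called words, while Answer 2 calls it TestData):

CREATE TABLE words (
    keyword               VARCHAR(255),
    times_phrase_searched INT
);

INSERT INTO words (keyword, times_phrase_searched) VALUES
    ('open windows', 1000), ('closed windows', 750),
    ('open doors', 350), ('closed doors', 250),
    ('nice window', 100), ('nice windows', 50), ('ugly doors', 25);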

3 Answers:

Answer 0 (score: 1)

If you don't have a numbers table, create one (see here for how). Then you can use this query to get the per-word counts:
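
For reference, a minimal sketch of such a numbers table, assuming the name numbers and column n used in the query below (five rows are enough here, since no keyword contains more than five words):

-- numbers table with values 1..5; extend the range to cover the longest phrase
CREATE TABLE numbers (n INT PRIMARY KEY);
INSERT INTO numbers (n) VALUES (1), (2), (3), (4), (5);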

select SUBSTRING_INDEX(SUBSTRING_INDEX(words.keyword, ' ', numbers.n), ' ', -1) word, 
       SUM(words.times_phrase_searched) times_word_in_searches
from numbers 
inner join words on CHAR_LENGTH(words.keyword) - CHAR_LENGTH(REPLACE(words.keyword, ' ', '')) >= numbers.n - 1
group by word
order by sum(words.times_phrase_searched) desc;

This splits up your keyword column so that each word ends up in its own row. Grouping and summing is then a simple matter.
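
For illustration, running the same join without the GROUP BY shows the intermediate rows the aggregation works on, one row per word of each phrase (e.g. 'open windows' yields ('open', 1000) and ('windows', 1000)):

-- intermediate result: one (word, times_phrase_searched) row per word of each phrase
select SUBSTRING_INDEX(SUBSTRING_INDEX(words.keyword, ' ', numbers.n), ' ', -1) word,
       words.times_phrase_searched
from numbers
inner join words on CHAR_LENGTH(words.keyword) - CHAR_LENGTH(REPLACE(words.keyword, ' ', '')) >= numbers.n - 1;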

DEMO

Answer 1 (score: 1)

Try this...

DECLARE @temp TABLE 
  ( 
     keyword VARCHAR(max), 
     times   INT 
  ) 

INSERT INTO @temp 
VALUES ('open windows', 1000), 
       ('closed windows', 750), 
       ('open doors', 350), 
       ('closed doors', 250), 
       ('nice window', 100), 
       ('nice windows', 50), 
       ('ugly doors', 25) 

-- build one comma-separated string containing every word of every keyword
DECLARE @allValues VARCHAR(max) = (SELECT Stuff((SELECT ',' + Replace(p2.keyword, ' ', ',') 
                                                 FROM   @temp p2 
                                                 ORDER  BY p2.keyword 
                                                 FOR xml path(''), type).value('.', 'varchar(max)'), 1, 1, '')) 


-- find distinct words 
SELECT DISTINCT t.element AS word, 
                (SELECT Sum(k.times) 
                 FROM   @temp k 
                 -- pad with spaces so 'window' does not also match 'windows'
                 WHERE  ' ' + k.keyword + ' ' LIKE '% ' + t.element + ' %') AS times_word_in_searches 
FROM   dbo.Func_split(@allValues, ',') t 

The split function Func_split (credit: https://stackoverflow.com/a/21428746/710925):

CREATE FUNCTION [dbo].[func_Split] 
    (   
    @DelimitedString    varchar(8000),
    @Delimiter              varchar(100) 
    )
RETURNS @tblArray TABLE
    (
    ElementID   int IDENTITY(1,1),  -- Array index
    Element     varchar(1000)               -- Array element contents
    )
AS
BEGIN

    -- Local Variable Declarations
    -- ---------------------------
    DECLARE @Index      smallint,
            @Start      smallint,
            @DelSize    smallint

    SET @DelSize = LEN(@Delimiter)

    -- Loop through source string and add elements to destination table array
    -- ----------------------------------------------------------------------
    WHILE LEN(@DelimitedString) > 0
    BEGIN

        SET @Index = CHARINDEX(@Delimiter, @DelimitedString)

        IF @Index = 0
            BEGIN

                INSERT INTO
                    @tblArray 
                    (Element)
                VALUES
                    (LTRIM(RTRIM(@DelimitedString)))

                BREAK
            END
        ELSE
            BEGIN

                INSERT INTO
                    @tblArray 
                    (Element)
                VALUES
                    (LTRIM(RTRIM(SUBSTRING(@DelimitedString, 1,@Index - 1))))

                SET @Start = @Index + @DelSize
                SET @DelimitedString = SUBSTRING(@DelimitedString, @Start , LEN(@DelimitedString) - @Start + 1)

            END
    END

    RETURN
END
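
As a side note, on SQL Server 2016 or later the string concatenation and the custom split function can both be replaced by the built-in STRING_SPLIT; a minimal sketch under that assumption, reusing the @temp table variable from above:

-- Requires SQL Server 2016+ (STRING_SPLIT); groups directly on the split-out words.
SELECT s.value AS word, 
       SUM(t.times) AS times_word_in_searches 
FROM   @temp t 
       CROSS APPLY STRING_SPLIT(t.keyword, ' ') s 
GROUP  BY s.value 
ORDER  BY times_word_in_searches DESC 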

Answer 2 (score: 0)

with tmp(word, rest, times_phrase_searched) as (
    -- anchor: first word of each phrase plus the remaining words
    select cast(left(keyword, charindex(' ', keyword + ' ') - 1) as varchar(max)),
           cast(stuff(keyword, 1, charindex(' ', keyword + ' '), '') as varchar(max)),
           times_phrase_searched
    from TestData
    union all
    -- recursive step: peel off the next word until nothing is left
    select cast(left(rest, charindex(' ', rest + ' ') - 1) as varchar(max)),
           cast(stuff(rest, 1, charindex(' ', rest + ' '), '') as varchar(max)),
           times_phrase_searched
    from tmp
    where rest > ''
)
select word,
       sum(times_phrase_searched) as times_word_in_searches
from tmp
group by word;

I'm not sure, but please check this out...