SQL: count the occurrences of every word in a table

Asked: 2018-05-22 21:56:43

Tags: sql sql-server reporting-services ssms

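I am trying to create a word cloud from a customer survey table. I want to count the occurrences of every word in one particular column of the table; that column holds all of the customer survey comments. I tried to follow the instructions below, but I can't figure out how to code it so that every word across all customer comments is counted.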

http://sqljason.com/2012/03/making-tag-cloud-with-ssrs-rich-text.html

Apologies, I am somewhat new to SQL.

3 answers:

Answer 0 (score: 1)

I tested this on MS SQL Server 2012 and 2014.

    --Create table
    DECLARE @t TABLE (RowNum int null, comments varchar(max))

    --Build table
    INSERT INTO @t
    (RowNum, comments)
    SELECT ROW_NUMBER() OVER(ORDER BY example.comment DESC) AS RowNum,
           REPLACE(REPLACE(example.comment, '!', ''), ',', '')  FROM
    (
    SELECT 'This website is awesome' AS comment UNION
    SELECT 'I like your website, however it could be better' UNION
    SELECT 'The menu button at the top is really nice!'
    )example

    --Show table
    SELECT * FROM @t

    --Setup vars
    DECLARE @i int = 1
    DECLARE @Count int = (SELECT COUNT(t.RowNum) FROM @t t)
    DECLARE @delimiter varchar(1) = ' '
    DECLARE @output TABLE(splitdata NVARCHAR(MAX))

    --Iterate through a table and build output table
    WHILE @i <= @Count
    BEGIN
        --Do something on one row at a time ie: WHERE(RowNum = @i)
        DECLARE @string varchar(max) = (SELECT t.comments FROM @t t WHERE(t.RowNum = @i))

        DECLARE @start int
        DECLARE @end int
        SELECT @start = 1, @end = CHARINDEX(@delimiter, @string)

        --Split the current comment on the delimiter, one word per pass
        WHILE @start < LEN(@string) + 1
        BEGIN
            --No delimiter left: treat the end of the string as the last boundary
            IF @end = 0
                SET @end = LEN(@string) + 1

            INSERT INTO @output (splitdata)
            VALUES (SUBSTRING(@string, @start, @end - @start))

            SET @start = @end + 1
            SET @end = CHARINDEX(@delimiter, @string, @start)
        END
        SET @i = @i + 1 --Iterate i
    END

    --Show output table
    SELECT * FROM @output

    --Summarize words
    SELECT o.splitdata, COUNT(*) AS Cnt FROM @output o
    GROUP BY o.splitdata
    ORDER BY Cnt DESC

Answer 1 (score: 0)

The simplest approach is the following, where countwords is the column to check and 'wet' is the word you want to count.

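    SELECT SUM(CTS) AS TotalOccurrencesOfWord
    FROM
    (
        --The length difference is the number of characters removed,
        --so divide by the word length to get the number of occurrences
        SELECT (LEN(countwords) - LEN(REPLACE(countwords, 'wet', ''))) / LEN('wet') AS CTS
        FROM #temp --table you are using
    ) a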

See also: How to count instances of character in SQL Column
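As a small worked check of the length arithmetic, here is a sketch against a throwaway table variable. The `@demo` table and its sample rows are made up for illustration. Note that `REPLACE` matches substrings, so 'wetter' also counts as a hit for 'wet'.

```sql
-- Hypothetical sample data, for illustration only
DECLARE @demo TABLE (countwords varchar(100));

INSERT INTO @demo (countwords)
VALUES ('wet wet dry'),  -- 'wet' appears twice
       ('wetter'),       -- substring match: counted once
       ('dry');          -- no match

-- Characters removed by REPLACE, divided by the word length,
-- gives the number of occurrences in each row
SELECT SUM((LEN(countwords) - LEN(REPLACE(countwords, 'wet', ''))) / LEN('wet'))
       AS TotalOccurrencesOfWord
FROM @demo;
-- Returns 3 (2 + 1 + 0)
```

This counts one chosen word at a time; it does not by itself produce a count for every distinct word in the column.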

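Answer 2 (score: 0)

If you are using SQL Server 2016 or later (STRING_SPLIT also requires database compatibility level 130 or higher), the following may help: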

    SELECT
          REPLACE(REPLACE(s.value, ',', ''), ' ', '') [Word]
        , COUNT(*) Occurrence
    FROM TableName t
        CROSS APPLY STRING_SPLIT([Comments], ' ') s
    GROUP BY REPLACE(REPLACE(s.value, ',', ''), ' ', '')

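If you are on a version older than SQL Server 2016, you will need a user-defined [Split String] function; there are many examples online, just search for one. You can still use the query above on older versions by substituting that user-defined function for STRING_SPLIT.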
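A minimal sketch of what such a function might look like, reusing the same CHARINDEX/SUBSTRING loop as the first answer. The name `dbo.SplitString` and its parameters are hypothetical (it is not a built-in), and `TableName` and `Comments` are the placeholder names from the query above.

```sql
-- Hypothetical helper; dbo.SplitString is an illustrative name, not a built-in.
CREATE FUNCTION dbo.SplitString
(
    @string    NVARCHAR(MAX),
    @delimiter NCHAR(1)
)
RETURNS @output TABLE (value NVARCHAR(MAX))
AS
BEGIN
    DECLARE @start INT = 1;
    DECLARE @end   INT = CHARINDEX(@delimiter, @string);

    WHILE @start < LEN(@string) + 1
    BEGIN
        -- No delimiter left: the end of the string is the last boundary
        IF @end = 0
            SET @end = LEN(@string) + 1;

        -- Skip empty tokens produced by consecutive delimiters
        IF @end > @start
            INSERT INTO @output (value)
            VALUES (SUBSTRING(@string, @start, @end - @start));

        SET @start = @end + 1;
        SET @end   = CHARINDEX(@delimiter, @string, @start);
    END;

    RETURN;
END;
GO

-- Same shape as the STRING_SPLIT query, using the helper instead
SELECT s.value AS Word, COUNT(*) AS Occurrence
FROM TableName t
    CROSS APPLY dbo.SplitString(t.Comments, ' ') s
GROUP BY s.value
ORDER BY Occurrence DESC;
```

A loop-based multi-statement function like this is easy to follow but can be slow on large tables; set-based splitters (tally table or XML based) are common alternatives.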

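Also, for the SSRS report, I would populate a table with this data as often as needed and point the report at that table; I would not recommend running this query every time the report is executed.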
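One way that refresh could be sketched, assuming SQL Server 2016 or later (for `STRING_SPLIT`) and the placeholder names `TableName` and `Comments` from the query above. The summary table `dbo.WordCloudCounts` is a hypothetical name chosen for illustration.

```sql
-- Hypothetical summary table for the report; the name is illustrative.
IF OBJECT_ID(N'dbo.WordCloudCounts', N'U') IS NULL
    CREATE TABLE dbo.WordCloudCounts
    (
        Word       NVARCHAR(MAX) NOT NULL,
        Occurrence INT           NOT NULL
    );

BEGIN TRANSACTION;

    -- Rebuild the summary from scratch on each refresh
    DELETE FROM dbo.WordCloudCounts;

    INSERT INTO dbo.WordCloudCounts (Word, Occurrence)
    SELECT w.Word, COUNT(*)
    FROM TableName t
        CROSS APPLY STRING_SPLIT(t.Comments, ' ') s
        CROSS APPLY (SELECT REPLACE(s.value, ',', '') AS Word) w
    WHERE w.Word <> ''   -- drop empty tokens from repeated spaces
    GROUP BY w.Word;

COMMIT TRANSACTION;
```

The report dataset can then be a plain `SELECT Word, Occurrence FROM dbo.WordCloudCounts`, and the refresh could be run on a schedule, for example from a SQL Server Agent job.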