选择多个DATE(年)范围内的数据到命名列SQL Server

时间:2015-09-15 08:23:00

标签: sql-server multiple-columns calculated-columns

我必须从SQL Sever表中选择多个日期范围的数据,即

1990-1994,  1992-1996,  1994-1998,  1996-2000,  1998-2002,  2000-2004,  
2002-2006,  2004-2008,  2006-2010,  2008-2012,  2010-2014  

我使用此查询来获取没有DATE范围的数据,即

SELECT 
    aid, research_area_category_id, 
    CAST(research_area as VARCHAR(100)) [research_area],
    COUNT(*) [Counting]
FROM 
    sub_aminer_paper
GROUP BY 
    CAST(research_area as VARCHAR(100)), aid, research_area_category_id
HAVING 
    aid = 12403 
ORDER BY 
    Counting DESC

这给出了图像中的输出,即

enter image description here

现在,对于使用WHERE子句的每个DATE范围,我必须在DATE范围的相应列中显示数据。虽然我使用了这个查询,即

SELECT 
    aid, research_area_category_id, 
    [research_area] = CAST(research_area as VARCHAR(100)), 
    [Counting] = COUNT(*),
    [1990 - 1994] = SUM(CASE WHEN p_year BETWEEN 1990 AND 1994 THEN 1 ELSE 0 END),
    [1992 - 1996] = SUM(CASE WHEN p_year BETWEEN 1992 AND 1996 THEN 1 ELSE 0 END),
    [1994 - 1998] = SUM(CASE WHEN p_year BETWEEN 1994 AND 1998 THEN 1 ELSE 0 END),
    [1996 - 2000] = SUM(CASE WHEN p_year BETWEEN 1996 AND 2000 THEN 1 ELSE 0 END),
    [1998 - 2002] = SUM(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN 1 ELSE 0 END),
    [2000 - 2004] = SUM(CASE WHEN p_year BETWEEN 2000 AND 2004 THEN 1 ELSE 0 END),
    [2002 - 2006] = SUM(CASE WHEN p_year BETWEEN 2002 AND 2006 THEN 1 ELSE 0 END),
    [2004 - 2008] = SUM(CASE WHEN p_year BETWEEN 2004 AND 2008 THEN 1 ELSE 0 END),
    [2006 - 2010] = SUM(CASE WHEN p_year BETWEEN 2006 AND 2010 THEN 1 ELSE 0 END),
    [2008 - 2012] = SUM(CASE WHEN p_year BETWEEN 2008 AND 2012 THEN 1 ELSE 0 END),
    [2010 - 2014] = SUM(CASE WHEN p_year BETWEEN 2010 AND 2014 THEN 1 ELSE 0 END)
FROM 
    sub_aminer_paper
WHERE 
    aid = 2937  
    AND p_year BETWEEN 1990 AND 2014            
GROUP BY
    aid, CAST(research_area AS VARCHAR(100)), research_area_category_id
ORDER BY aid ASC, Counting DESC

此查询输出:

enter image description here

但是我需要(1990-1994,1992-1996,1994-1998 .....等)这些专栏的research_area_category_id值。例如。在1990 - 1994列中,它应显示相应的research_area_category_id1132,而不是Counting,即1,{ {1}}和1类似,它应该在1列中显示33而不是2,反之亦然。

请提前帮助和谢谢。

1 个答案:

答案 0 :(得分:1)

Tab Alleman已经在评论中提到了最好的方法,但我会变得厚颜无耻并将其添加为答案。

您很清楚,您希望显示透视日期列中research_area_category_id列的值。因此,这里的第一步是使research_area_category_id为每个CASE语句的输出,而不是整数1

CASE WHEN p_year BETWEEN 1990 AND 1994 THEN research_area_category_id ELSE 0 END

如果仅使用此更改运行代码,您会发现SUM函数会使输出为research_area_category_id值的倍数。例如,1998 - 2002的第一行将具有值66(两次33)。

因此,这告诉我们您不再需要使用SUM功能。但是,您仍然希望在具有不同p_year值的所有行中聚合(分组)数据,因此您必须使用某种类型的聚合函数。如果不这样做,SQL Server会抛出错误,因为您没有按p_year进行分组。

在这种情况下,最简单的聚合函数是MAX,它从被分组为一行的行集中获取最高值。 official documentation有一些简单的例子。

这仅适用于您的情况,前提是research_area_category_id的所有值均为正(大于0语句默认为的CASE)。

将对CASE语句的更改与SUM更改为MAX相结合,可提供以下版本的查询:

SELECT 
aid, research_area_category_id, 
[research_area] = CAST(research_area as VARCHAR(100)), 
[Counting] = COUNT(*),
[1990 - 1994] = MAX(CASE WHEN p_year BETWEEN 1990 AND 1994 THEN research_area_category_id ELSE 0 END),
[1992 - 1996] = MAX(CASE WHEN p_year BETWEEN 1992 AND 1996 THEN research_area_category_id ELSE 0 END),
[1994 - 1998] = MAX(CASE WHEN p_year BETWEEN 1994 AND 1998 THEN research_area_category_id ELSE 0 END),
[1996 - 2000] = MAX(CASE WHEN p_year BETWEEN 1996 AND 2000 THEN research_area_category_id ELSE 0 END),
[1998 - 2002] = MAX(CASE WHEN p_year BETWEEN 1998 AND 2002 THEN research_area_category_id ELSE 0 END),
[2000 - 2004] = MAX(CASE WHEN p_year BETWEEN 2000 AND 2004 THEN research_area_category_id ELSE 0 END),
[2002 - 2006] = MAX(CASE WHEN p_year BETWEEN 2002 AND 2006 THEN research_area_category_id ELSE 0 END),
[2004 - 2008] = MAX(CASE WHEN p_year BETWEEN 2004 AND 2008 THEN research_area_category_id ELSE 0 END),
[2006 - 2010] = MAX(CASE WHEN p_year BETWEEN 2006 AND 2010 THEN research_area_category_id ELSE 0 END),
[2008 - 2012] = MAX(CASE WHEN p_year BETWEEN 2008 AND 2012 THEN research_area_category_id ELSE 0 END),
[2010 - 2014] = MAX(CASE WHEN p_year BETWEEN 2010 AND 2014 THEN research_area_category_id ELSE 0 END)
FROM 
    sub_aminer_paper
WHERE 
    aid = 2937  
    AND p_year BETWEEN 1990 AND 2014            
GROUP BY
    aid, CAST(research_area AS VARCHAR(100)), research_area_category_id
ORDER BY aid ASC, Counting DESC

如果您感兴趣,我会在this SQL fiddle中模拟您的几行数据,以便在回答之前测试此查询。 (我猜测p_year的值,但他们证明了原则,除非我误解了你的要求。)