Getting exact percentages from a SQL query in R over an RODBC connection

Time: 2016-08-16 18:06:57

Tags: sql r rodbc

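I am trying to write a function in R, using the RODBC package, that loops over five tables in a SQL database and finds, for each variable and each year, the number of rows, the percent null, and the percent missing. I have not been able to write a single function that gives me this output accurately. I wrote a function that outputs the total count and the percent null, but I cannot get it to produce an accurate percent missing directly: the value seems to be rounded to a whole number, and not consistently up or down. My code is below. Any help would be greatly appreciated.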

PctNull <- sqlQuery(channel, "select 
                [EVENT_YEAR] AS 'YEAR', 
                COUNT(*) AS 'TOTAL',
                (((COUNT(CASE WHEN MOTHER_EDUCATION_TRENDABLE = -1 THEN 1 END))*100)/COUNT(*)) AS 'PctMiss',
                (((COUNT(*) - COUNT(MOTHER_EDUCATION_TRENDABLE))*100)/COUNT(*)) AS 'PctNull'


                from [GA_CMH].[dbo].[BIRTHS]

                GROUP BY [EVENT_YEAR]
                ORDER BY [EVENT_YEAR]")

Here is my output and desired format; however, I would like to improve the accuracy of PctMiss:

1 answer:

Answer 0: (score: 1)

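This is known SQL Server behavior: dividing one integer by another yields an integer, so when an expression uses integer columns (or integer results such as COUNT()), the values must be converted to decimals. You can do this either by including at least one decimal value in the expression or by using CAST or CONVERT.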
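To make the truncation concrete, here is a small R sketch of the same arithmetic. The counts (7 rows coded -1 out of 300) are made up for illustration, and R's `%/%` operator stands in for SQL Server's integer division, which gives the same result for non-negative operands.

```r
# Hypothetical counts for one year: 7 rows coded -1 out of 300 rows.
n_missing <- 7L
n_total   <- 300L

# Integer arithmetic, as in the original query: the fractional part is lost.
# %/% is R's integer-division operator, used here as a stand-in for
# SQL Server's int / int division (same result for non-negative values).
pct_int <- (n_missing * 100L) %/% n_total

# Decimal arithmetic, as in the corrected query: the fraction is kept.
pct_dec <- (n_missing * 100.00) / n_total

print(pct_int)            # 2
print(round(pct_dec, 2))  # 2.33
```

The only difference between the two expressions is the type of the operands, which is why changing 100 to 100.00 in the query is enough.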

For implicit conversion, multiply the COUNT() value by 100.00 (two decimal places), or multiply an integer value by 1.00:

PctNull <- sqlQuery(channel, 
                    "SELECT [EVENT_YEAR] AS 'YEAR', 
                            COUNT(*) AS 'TOTAL',
                            (((COUNT(CASE WHEN MOTHER_EDUCATION_TRENDABLE = -1 
                                          THEN 1 
                                     END)) * 100.00) / COUNT(*)) AS 'PctMiss',
                            (((COUNT(*) - COUNT(MOTHER_EDUCATION_TRENDABLE)) * 100.00) /  
                               COUNT(*)) AS 'PctNull'
                     FROM [GA_CMH].[dbo].[BIRTHS]
                     GROUP BY [EVENT_YEAR]
                     ORDER BY [EVENT_YEAR]")

For explicit conversion, use CAST to declare the type and precision explicitly:

((CAST(COUNT(CASE WHEN MOTHER_EDUCATION_TRENDABLE = -1 
                  THEN 1 
             END) * 100 AS DECIMAL(10,2))) / COUNT(*))

Or use CONVERT:

(((CONVERT(DECIMAL(10,2), COUNT(CASE WHEN MOTHER_EDUCATION_TRENDABLE = -1 
                                     THEN 1 
                                END)) * 100)) / COUNT(*))
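
Since the question asks for a single function that can be reused across tables and variables, one possible approach is to generate the query text per column. This is only a sketch: `build_pct_query` and its parameters are hypothetical names, the table and column come from the question, and the identifiers are pasted straight into the SQL, so it should only be given trusted names.

```r
# Build the per-year summary query for one table/column pair.
# `table`, `column`, and `missing_code` are illustrative parameters;
# they are inserted into the SQL text as-is, so pass only trusted identifiers.
build_pct_query <- function(table, column, missing_code = -1) {
  sprintf(
    "SELECT [EVENT_YEAR] AS 'YEAR',
            COUNT(*) AS 'TOTAL',
            (COUNT(CASE WHEN %2$s = %3$s THEN 1 END) * 100.00) / COUNT(*) AS 'PctMiss',
            ((COUNT(*) - COUNT(%2$s)) * 100.00) / COUNT(*) AS 'PctNull'
     FROM %1$s
     GROUP BY [EVENT_YEAR]
     ORDER BY [EVENT_YEAR]",
    table, column, as.character(missing_code)
  )
}

# Columns to summarise; only the one from the question is listed here.
columns <- c("MOTHER_EDUCATION_TRENDABLE")

# One query string per column, named by column.
queries <- vapply(
  columns,
  function(col) build_pct_query("[GA_CMH].[dbo].[BIRTHS]", col),
  character(1)
)

# With an open RODBC channel (as in the question), each query could be run as:
# results <- lapply(queries, function(q) RODBC::sqlQuery(channel, q))
```

Keeping the query construction separate from `sqlQuery` means the generated SQL can be inspected with `cat(queries[[1]])` before anything is sent to the server.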