Using R

Time: 2019-06-27 12:16:20

Tags: r tsql

I have the following case where the procedure in T-SQL is a bit slow, and I am looking for a possible solution in R:

I want to do a cross tabulation over the following example table:

surveyID    QuestionID  AnswerID
1000        1           1
1000        2           3
1000        3           2
1000        4           1
1001        1           3
1001        2           2
1001        3           1
1001        4           3

and get a result like this:

QuestionIDx QuestionIDy AnswerIDx   AnswerIDy   Frequency
1           1           1           2           x
1           1           1           3           x
1           1           2           3           x
.....

I basically left self-join the table on surveyID and then use R's table or xtabs function to get the frequencies.

-- Count every (question x, question y, answer x, answer y) combination with table()
INSERT INTO #CrossTabResults([ProtoQuestionIDx], [ProtoQuestionIDy], [AnswerPosIDx], [AnswerPosIDy], [Frequency])
EXECUTE sp_execute_external_script
    @language = N'R',
    @script = N'
        OutputDataSet <- data.frame(table(InputDataSet$ProtoQuestionIDx, InputDataSet$ProtoQuestionIDy, InputDataSet$AnswerPosIDx, InputDataSet$AnswerPosIDy))
    ',
    @input_data_1 = N'SELECT [SurveyInstanceID], [ProtoQuestionIDx], [ProtoQuestionIDy], [AnswerPosIDx], [AnswerPosIDy] FROM #JoinedSurveys'


or

-- Same counts, expressed with the xtabs() formula interface
INSERT INTO #CrossTabResults([ProtoQuestionIDx], [ProtoQuestionIDy], [AnswerPosIDx], [AnswerPosIDy], [Frequency])
EXECUTE sp_execute_external_script
    @language = N'R',
    @script = N'
        OutputDataSet <- data.frame(xtabs(~ ProtoQuestionIDx + ProtoQuestionIDy + AnswerPosIDx + AnswerPosIDy, data = InputDataSet))
    ',
    @input_data_1 = N'SELECT [SurveyInstanceID], [ProtoQuestionIDx], [ProtoQuestionIDy], [AnswerPosIDx], [AnswerPosIDy] FROM #JoinedSurveys'
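
For reference, the same two calls can be tried outside SQL Server. This is a minimal sketch in plain R on the example table from the top of the question. The names surveys, joined, freq_table and freq_xtabs are made up for illustration, and merge stands in for the self-join that builds #JoinedSurveys, so it is not the actual procedure.

```r
# Example table from the question; `surveys` is a hypothetical name.
surveys <- data.frame(
  surveyID   = rep(c(1000, 1001), each = 4),
  QuestionID = rep(1:4, times = 2),
  AnswerID   = c(1, 3, 2, 1, 3, 2, 1, 3)
)

# Self-join on surveyID (assumed stand-in for #JoinedSurveys):
# every pair of questions answered within the same survey.
joined <- merge(surveys, surveys, by = "surveyID", suffixes = c("x", "y"))

# Variant 1: table() on the four columns, flattened to a data frame.
# The result columns are named Var1..Var4 and Freq.
freq_table <- as.data.frame(
  table(joined$QuestionIDx, joined$QuestionIDy,
        joined$AnswerIDx, joined$AnswerIDy)
)

# Variant 2: xtabs() with a formula; the result columns keep their names.
freq_xtabs <- as.data.frame(
  xtabs(~ QuestionIDx + QuestionIDy + AnswerIDx + AnswerIDy, data = joined)
)

# Both variants count the same cells in the same order.
print(all(freq_table$Freq == freq_xtabs$Freq))
print(head(freq_xtabs[freq_xtabs$Freq > 0, ]))
```

On this data both variants give the same frequencies, so the choice between table and xtabs looks like a matter of notation rather than of results.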

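I am not sure this is the right approach, and it does not improve execution speed either. I am looking for the fastest solution. Any help is appreciated.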
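
One thing that may be worth checking, offered as an observation and not as a measured fix: table and xtabs both return a row for every combination of factor levels, including combinations that never occur, so the result set sent back for the INSERT can be much larger than the observed data. The sketch below uses the example table from the question (surveys, joined, counts and observed are hypothetical names, and merge stands in for the self-join) and keeps only the non-zero counts.

```r
# Example table from the question; all object names here are hypothetical.
surveys <- data.frame(
  surveyID   = rep(c(1000, 1001), each = 4),
  QuestionID = rep(1:4, times = 2),
  AnswerID   = c(1, 3, 2, 1, 3, 2, 1, 3)
)

# Assumed stand-in for the self-joined #JoinedSurveys table.
joined <- merge(surveys, surveys, by = "surveyID", suffixes = c("x", "y"))

# Full cross tabulation: one row per combination of levels, zeros included.
counts <- as.data.frame(
  xtabs(~ QuestionIDx + QuestionIDy + AnswerIDx + AnswerIDy, data = joined)
)

# Keep only the combinations that actually occur in the data.
observed <- counts[counts$Freq > 0, ]

print(nrow(counts))    # all level combinations
print(nrow(observed))  # combinations with at least one occurrence
```

Whether trimming the output this way changes the overall run time inside sp_execute_external_script is something I have not measured.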

0 answers:

No answers