I am using the following dataset to compute readmissions within 30 days, using SQL code in R.
str(tb1)
Classes ‘data.table’ and 'data.frame': 35255 obs. of 56 variables:
$ Source : Factor w/ 2 levels "C","E": 1 1 1 1 1 1 1 1 1 2 ...
$ UniversalID : int 712 123 123 123 156 189 481 ...
$ DOB : Factor w/ 5491 levels "1/1/1997","1/1/1998",..: 2431 2431 2431 2431 2431 2431 2431 2431 2431 459 ...
$ Gender : Factor w/ 2 levels "FEMALE","MALE": 1 1 1 1 1 1 1 1 1 1 ...
$ AgeonDOS : int 14 14 14 14 14 14 14 14 14 17 ...
$ RaceDescription : Factor w/ 7 levels "Asian or Pacific Islander",..: 7 7 7 7 7 7 7 7 7 7 ...
$ ClaimType : Factor w/ 5 levels "C","H","I","L",..: 3 3 3 3 3 3 3 3 3 3 ...
$ ClaimTypeDescription : Factor w/ 5 levels "Home Health Claims",..: 2 2 2 2 2 2 2 2 2 2 ...
$ CYTDOS : int 2014 2014 2014 2014 2014 2014 2014 2014 2014 2014 ...
$ FDOS : Factor w/ 385 levels "1/1/2014","1/10/2014",..: 350 350 350 350 350 350 350 350 350 353 ...
$ TDOS : Factor w/ 365 levels "1/1/2014","1/10/2014",..: 306 306 306 306 306 306 306 306 306 309 ...
When I run sqldf, I get the following error:
> readcount<-sqldf("SELECT A.UniversalID , A.TDOS, B.FDOS, A.PROC,A.seqno
+ FROM tb1 AS A
+ JOIN tb1 AS B
+ ON A.UniversalID = B.UniversalID
+ where B.seqno = A.seqno + 1 and B.FDOS – A.TDOS between 0 and 30
+ ")
Error in sqliteSendQuery(con, statement, bind.data) :
error in statement: near "–": syntax error
If I leave out the last condition of the where clause, I get no error:
> readcount<-sqldf("SELECT A.UniversalID , A.TDOS, B.FDOS, A.PROC,A.seqno
+ FROM tb1 AS A
+ JOIN tb1 AS B
+ ON A.UniversalID = B.UniversalID
+ where B.seqno = A.seqno + 1
+ ")
>
What is wrong with B.FDOS-A.TDOS between 0 and 30?
I also tried the following:
> readcount<-sqldf("SELECT A.UniversalID , A.TDOS, B.FDOS, A.PROC,A.seqno
+ FROM tb1 AS A
+ JOIN tb1 AS B
+ ON A.UniversalID = B.UniversalID
+ where B.seqno = A.seqno + 1 and DATEDIFF(A.TDOS,B.FDOS) between 0 and 30
+ ")
Error in sqliteSendQuery(con, statement, bind.data) :
error in statement: no such function: DATEDIFF
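For reference, here is a minimal sketch of one possible way around both errors; it is not a confirmed fix for the real data. Two things stand out from the messages above: SQLite has no DATEDIFF function, and the "–" in the first query is a typographic dash rather than the ASCII minus sign "-". The sketch assumes FDOS and TDOS hold m/d/yyyy strings, converts them to Date in R (sqldf hands Date columns to SQLite as day counts), and then subtracts them with a plain "-". The toy data frame and the gap_days column are hypothetical and exist only for illustration.

```r
library(sqldf)

# Hypothetical toy data standing in for tb1; the values are made up.
toy <- data.frame(
  UniversalID = c(123, 123, 123, 156),
  seqno       = c(1, 2, 3, 1),
  FDOS        = c("1/5/2014", "1/20/2014", "4/1/2014", "2/1/2014"),
  TDOS        = c("1/10/2014", "1/25/2014", "4/3/2014", "2/2/2014"),
  stringsAsFactors = TRUE
)

# Convert the m/d/yyyy factors to Date so that SQLite sees numeric day counts
# instead of text labels.
toy$FDOS <- as.Date(as.character(toy$FDOS), format = "%m/%d/%Y")
toy$TDOS <- as.Date(as.character(toy$TDOS), format = "%m/%d/%Y")

# Same self-join as in the question, with an ASCII "-" in the date difference.
readcount <- sqldf("SELECT A.UniversalID, A.TDOS, B.FDOS, A.seqno,
                           B.FDOS - A.TDOS AS gap_days
                    FROM toy AS A
                    JOIN toy AS B
                      ON A.UniversalID = B.UniversalID
                    WHERE B.seqno = A.seqno + 1
                      AND B.FDOS - A.TDOS BETWEEN 0 AND 30")
print(readcount)
```

If the dates had to stay as text, SQLite's julianday() would be another option, but it expects ISO 8601 strings such as 2014-01-10, so the m/d/yyyy values would still need reformatting first.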