How to use a SQL script in R

Date: 2017-02-19 09:52:02

Tags: sql r

I need to write a SQL query.

These are my tables:

x <- read.csv("C:/Users/Admin/Downloads/Set 1-1.csv",sep=",",dec=".")
y <- read.csv("C:/Users/Admin/Downloads/Set 1-2 - Copy.csv",sep=",",dec=".")
y$score <- 1

I tried to combine them:

library("sqldf")
select clientid,emailmessageid,null cnttrn,idatediff,null score from x 
union all select clientid,emailmessageid,cnttrn,idatediff,score from y

But I get the following errors:

  

select clientid,emailmessageid,null cnttrn,idatediff,null score from x

Error: unexpected symbol in "select clientid"

union all select clientid,emailmessageid,cnttrn,idatediff,score from y

Error: unexpected symbol in "union all"

Please help me fix it. Thanks.

dput(x)

ClientID    EmailMessageId  MinDate MaxDate IdSlip  WwsCreatedDate  ProductArticle  ProductGroupName    MainProductGroupName    CategoryGroupName   QtytItems   SumAmount   iDateDiff
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8    894DB62F7B7A6ED2    31.08.2016  31.08.2016  4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02    24.09.2015  item1   item2   item3   item4   1   580.0   -342
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8    894DB62F7B7A6ED2    31.08.2016  31.08.2016  4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02    24.09.2015  item1   item2   item3   item4   1   3190.0  -342

dput(y)

ClientID    EmailMessageId  CntTrn  iDateDiff   score
86139F31664463A8B7592B6887B731A9FC2C3489BB1756A5BF334CFDEA4EF604    9EDCC1391C208BA0    1   4   1
BD483D69913E3EBFE5FBA87A1FFAB7DCD061055FFB4342C2F27AC01F36833254    EF72D53990BC4805    1   5   1
0B3B2F06C3033B3AFD83BA59B405BCC79BC69801FD3B69931F117B8D754A80EB    9EDCC1391C208BA0    1   3   1

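1 Answer:

Answer 0 (score: 3)

This runs without errors for me. The only differences are the formatting of the query and that it is passed to sqldf() as a character string. Is the result correct?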

library(sqldf)

y <- read.table(text = "ClientID    EmailMessageId  CntTrn  iDateDiff   score
86139F31664463A8B7592B6887B731A9FC2C3489BB1756A5BF334CFDEA4EF604    9EDCC1391C208BA0    1   4   1
BD483D69913E3EBFE5FBA87A1FFAB7DCD061055FFB4342C2F27AC01F36833254    EF72D53990BC4805    1   5   1
0B3B2F06C3033B3AFD83BA59B405BCC79BC69801FD3B69931F117B8D754A80EB    9EDCC1391C208BA0    1   3   1", header = TRUE)

x <- read.table(header = TRUE, text = "ClientID    EmailMessageId  MinDate MaxDate IdSlip  WwsCreatedDate  ProductArticle  ProductGroupName    MainProductGroupName    CategoryGroupName   QtytItems   SumAmount   iDateDiff
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8    894DB62F7B7A6ED2    31.08.2016  31.08.2016  4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02    24.09.2015  item1   item2   item3   item4   1   580.0   -342
3E34C0C9FC05975CC0F01D7A3DEE73D022538FA04B17A0316178E090C04F84A8    894DB62F7B7A6ED2    31.08.2016  31.08.2016  4A19280A1164CF3F4A701EF9AE97A1F1084B611000B94C02    24.09.2015  item1   item2   item3   item4   1   3190.0  -342")

sqldf("
SELECT 
  ClientId,
  EmailMessageId,
  null CntTrn,
  iDateDiff,
  null Score 
FROM x 

UNION ALL 

SELECT 
      ClientId,
      EmailMessageId,
      CntTrn,
      iDateDiff,
      Score 
FROM y")
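
As a rough illustration of why the original attempt failed: SQL typed directly at the R prompt is parsed as R code, which is what produces the "unexpected symbol" errors, whereas the same statement wrapped in a string and handed to sqldf() is sent to the SQL engine. The sketch below uses tiny stand-in data frames with made-up values (not the asker's data) and a reduced column list, so treat it as a minimal example under those assumptions rather than the exact query from the question.

```r
library(sqldf)

# Hypothetical stand-in data frames; values are invented for illustration
x <- data.frame(ClientID = c("a", "b"), iDateDiff = c(-342L, -342L))
y <- data.frame(ClientID = "c", iDateDiff = 4L, score = 1L)

# Bare SQL is not valid R: parsing fails before sqldf is ever involved
bare <- try(parse(text = "select ClientID from x"), silent = TRUE)
is_parse_error <- inherits(bare, "try-error")

# The same kind of statement works once it is a string passed to sqldf()
query <- "
  SELECT ClientID, iDateDiff, NULL AS score FROM x
  UNION ALL
  SELECT ClientID, iDateDiff, score FROM y"
res <- sqldf(query)

# All rows of x followed by all rows of y, with score missing for the x rows
print(res)
```

Both SELECT statements in a UNION ALL must return the same number of columns, which is why the columns missing from x are filled in with NULL.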