Selecting multiple values from a single column using read.csv.sql

Time: 2014-11-11 09:50:52

Tags: r sqldf

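I am using read.csv.sql from the sqldf package to read a subset of rows from a file, where the subset is defined by several values of one column. Those values are stored in a separate vector.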

I have hacked together a form that works, but I would like to see the proper way to pass the SQL statement.

The code below gives a minimal example.

library(sqldf)

# some data
write.csv(mtcars, "mtcars.csv", quote = FALSE, row.names = FALSE)

# values to select from variable 'carb'
cc <- c(1, 2)

# This only selects last value from 'cc' vector
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb = ", cc ))

# So try using the 'in' operator - this works
read.csv.sql("mtcars.csv", sql = "select * from file where carb in (1,2)" ) 

# but this doesn't
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb in ", cc ))

# Finally this works
read.csv.sql("mtcars.csv", sql = paste("select * from file where carb in ", 
                                       paste("(", paste(cc, collapse=",") ,")")))

The last call above works, but is there a cleaner way to pass this statement? Thanks.

2 Answers:

Answer 0: (score: 2)

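1) fn$ Prefixing the call with fn$ from the gsubfn package (which sqldf loads automatically) performs string substitution: an R expression enclosed in backticks inside the SQL string is evaluated and replaced by its result. See the fn$ examples on the sqldf home page. In this case we have: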

fn$read.csv.sql("mtcars.csv", 
  sql = "select * from file where carb in ( `toString(cc)` )")

2) join Another approach is to create a data.frame of the desired carb values and perform a join with it:

Carbs <- data.frame(carb = cc)
read.csv.sql("mtcars.csv", sql = "select * from Carbs join file using (carb)")
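
To see what approach 1) ends up sending to SQLite, the same statement can be built in base R. This is only a sketch: it uses sprintf in place of fn$, and the variable name sql is chosen here for illustration. toString collapses the vector with ", ", which is exactly the list the in operator expects for numeric values.

```r
# Values to select from the 'carb' column (as in the question)
cc <- c(1, 2)

# toString() collapses the vector into a single comma-separated string
in_list <- toString(cc)

# Build the full statement; this mirrors the fn$ substitution using base R only
sql <- sprintf("select * from file where carb in (%s)", in_list)
print(sql)

# With sqldf loaded and mtcars.csv written as in the question, the string
# could then be passed on:
# read.csv.sql("mtcars.csv", sql = sql)
```

Character values would additionally need to be quoted before being collapsed, which this sketch does not handle.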

Answer 1: (score: 1)

You can use deparse, but I am not sure it is any cleaner than what you already have:

read.csv.sql("mtcars.csv",
             sql = paste("select * from file where carb in ", gsub("c","",deparse(cc)) ))

Note that this is not a general solution, because deparse does not always give you a string of the form c(...). It just happens to work in this case.
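
A small sketch of where the deparse trick breaks down, using only base R. The variable names below are made up for illustration. An integer sequence deparses to range notation rather than c(...), and a long vector is split across several strings, so neither yields a usable in list after the gsub call.

```r
# The case from the question: deparses to a single "c(...)" string
cc_num <- c(1, 2)
ok <- gsub("c", "", deparse(cc_num))
print(ok)

# An integer sequence deparses to range notation, with no parentheses at all
cc_seq <- 1:2
bad <- gsub("c", "", deparse(cc_seq))
print(bad)

# A long vector is wrapped by deparse() into several strings, so paste()
# would produce several broken SQL fragments instead of one statement
cc_long <- seq(0.5, 40.5, by = 1)
pieces <- deparse(cc_long)
print(length(pieces) > 1)
```

Building the list with paste(cc, collapse = ",") or toString(cc), as in the question and the other answer, avoids both problems.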