SQL - What is the best way to search a column for a specific range using numbers and letters?

Time: 2017-06-12 17:25:54

Tags: mysql r sqldf

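I have a table of codes made up of letters and numbers, specifically aa00 to ZZ99. I can't work out the best way to search this column for a range such as dd01 - GG99. What is the best way to do this? (I am using sqldf with RStudio.)

I have tried statements such as BETWEEN, but the result is not what I'm looking for. In fact, it returns the opposite, with uppercase letters instead of lowercase ones: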

SELECT prodcode
  FROM data
 WHERE prodcat BETWEEN 'GG99' AND 'dd01';

Edit, too long to post as a comment:

library(ggvis)
library(readr)
library(dplyr)
library(knitr)
library(sqldf)
library(tidyr)
data <- read_csv("C:/Users/name/Documents/test1.csv")
compn <-read_csv("C:/Users/name/Documents/test2.csv")


prodcode <- expand.grid(x1 = LETTERS,
                        x2 = letters,
                        x3 = 0:9,
                        x4 = 0:9)
prodcode$prodcat <- apply(data, 1, paste0, collapse = "")




test <- sqldf("SELECT prod
               FROM data, compn
               WHERE data.cono = compn.cono
               AND (SELECT * FROM prodcode 
                        WHERE (SUBSTR(UPPER(prodcat), 1, 2) >= 'DD' AND 
                        CAST(SUBSTR(prodcat, 3, 2) AS INT) >= 00 ) AND
                        (SUBSTR(UPPER(prodcat), 1, 2) <= 'GG' AND
                        CAST(SUBSTR(prodcat, 3, 2) AS INT) <= 99);
              GROUP BY prod
              ORDER BY prod ASC;")


test

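2 answers:

Answer 0 (score: 2)

Here is SQL code that does what you describe. In SQL, you need to separate the letters from the digits to make the comparison. Because the numeric part of your codes has a fixed width, you could also get by without converting it to INT. If your numeric values are not fixed width, you will have to decide on the appropriate sorting behavior.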

prodcode <- expand.grid(x1 = LETTERS,
                        x2 = letters,
                        x3 = 0:9,
                        x4 = 0:9)
prodcode$prodcat <- apply(prodcode, 1, paste0, collapse = "")

library(sqldf)

sqldf(
  "SELECT * FROM prodcode 
   WHERE (SUBSTR(UPPER(prodcat), 1, 2) >= 'DD' AND 
            CAST(SUBSTR(prodcat, 3, 2) AS INT) >= 00 ) AND
         (SUBSTR(UPPER(prodcat), 1, 2) <= 'GG' AND
            CAST(SUBSTR(prodcat, 3, 2) AS INT) <= 99)"
)
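Since every code is exactly four characters wide, a plain string comparison on the upper-cased value may be enough. The sketch below assumes sqldf's default SQLite backend, whose default binary collation sorts all uppercase ASCII letters before lowercase ones; that is why the mixed-case bounds 'GG99' and 'dd01' in the question matched an unexpected set of rows. The `codes` data frame is hypothetical sample data, not the asker's table.

```r
library(sqldf)

# Hypothetical sample codes: mixed case, always two letters plus two digits
codes <- data.frame(
  prodcat = c("aa00", "dd01", "Dd50", "FQ17", "GG99", "gh00", "ZZ99"),
  stringsAsFactors = FALSE
)

# Normalise the case first, then give BETWEEN its bounds as low AND high.
# With fixed-width codes, string order and code order agree.
in_range <- sqldf(
  "SELECT prodcat
     FROM codes
    WHERE UPPER(prodcat) BETWEEN 'DD01' AND 'GG99'
    ORDER BY UPPER(prodcat)"
)
in_range
```

Unlike the split letter/number comparison, this form also handles bounds whose numeric parts are not 00 and 99, for example 'DD50' to 'GG20'.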

Using it in a subquery:

proddata <- data.frame(prodcode = c("DD15", "BB08", "FQ17", "NN11"),
                       value = rnorm(4, 100, 15))

prodcode <- expand.grid(x1 = LETTERS,
                        x2 = letters,
                        x3 = 0:9,
                        x4 = 0:9)
prodcode$prodcat <- apply(prodcode, 1, paste0, collapse = "")

library(sqldf)

sqldf(
  "SELECT * 
   FROM proddata
   WHERE prodcode IN (SELECT UPPER(prodcat) FROM prodcode 
                      WHERE (SUBSTR(UPPER(prodcat), 1, 2) >= 'DD' AND 
                      CAST(SUBSTR(prodcat, 3, 2) AS INT) >= 00 ) AND
                      (SUBSTR(UPPER(prodcat), 1, 2) <= 'GG' AND
                      CAST(SUBSTR(prodcat, 3, 2) AS INT) <= 99))"
)
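For the joined query in the question's edit, the range test could be applied directly in the WHERE clause rather than through a lookup table. This is only a sketch: the `products` and `companies` data frames and their column layout (`cono`, `prod`, `prodcat`) are assumptions standing in for the asker's two CSV files, and sqldf's default SQLite backend is assumed.

```r
library(sqldf)

# Hypothetical stand-ins for the asker's CSV files; the columns are assumed
products <- data.frame(
  cono    = c(1, 1, 2, 3),
  prod    = c("widget", "gadget", "widget", "gizmo"),
  prodcat = c("dd01", "BB08", "FQ17", "NN11"),
  stringsAsFactors = FALSE
)
companies <- data.frame(
  cono = c(1, 2),
  name = c("Acme", "Globex"),
  stringsAsFactors = FALSE
)

# Join the two tables, keep rows whose upper-cased code is in range,
# and return each matching product once
test <- sqldf(
  "SELECT products.prod
     FROM products
     JOIN companies ON products.cono = companies.cono
    WHERE UPPER(products.prodcat) BETWEEN 'DD01' AND 'GG99'
    GROUP BY products.prod
    ORDER BY products.prod ASC"
)
test
```

The filter sits in the WHERE clause as a plain condition, so no subquery or semicolon is needed inside the statement.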

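Answer 1 (score: 0)

You might consider grepl with an appropriate regular expression, paired with the dplyr select command. It is hard to help at the moment without seeing what your "test" data frame actually consists of (i.e., there is no reproducible example).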
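A minimal sketch of the grepl idea, using made-up sample data since the real data frame is not shown. It uses dplyr's `filter()`, which is the verb that keeps rows (`select()` picks columns). The pattern is an assumption about the intended range: it spells out the two-letter prefixes DD through GG and therefore matches DD00 to GG99; excluding DD00 would need an extra condition.

```r
library(dplyr)

# Hypothetical sample data; the column name prodcode is assumed
proddata <- data.frame(
  prodcode = c("DD15", "BB08", "FQ17", "NN11", "dd01", "GG99", "GH00"),
  stringsAsFactors = FALSE
)

# Prefixes DD..DZ, EA..FZ, GA..GG, followed by exactly two digits
range_pattern <- "^(D[D-Z]|[EF][A-Z]|G[A-G])[0-9]{2}$"

# Upper-case the codes before matching so that dd01 and DD01 are treated alike
matched <- proddata %>%
  filter(grepl(range_pattern, toupper(prodcode), perl = TRUE))
matched
```

A regular expression like this has to be rewritten for every new pair of bounds, so a string comparison is usually easier to maintain when the range changes often.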