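I'm new to R, working with RStudio and enjoying it. I'm basically trying to break down a SQL statement so that every alias is replaced with its full table name. I've had a lot of help from this forum with the code below.
The code below takes a SQL statement and splits it into its basic components: SELECT, FROM and WHERE.
I'm using the SQL below as an example, but a statement could contain any number of tables and aliases. The aim is to write a loop that replaces the aliases with the full table names, for different SQL statements and different numbers of aliases. At the moment my gsub call only replaces the second alias in the query. Can anyone see the error in my logic?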
txt <- "SELECT AL1.attr1,AL2.attr2 FROM Table_1 as AL1, Table_2 as AL2 WHERE AL1.attr1 == 1"
###########################################################################################
# First split the SQL statement into SELECT, FROM and WHERE clauses (one row for each)
# Take the FROM clause and split that on period so AL1.Attrib1 = AL1 Attrib1
# Then split on 'as', separating the alias from the actual table name
###########################################################################################
# trimws() (base R) strips the surrounding whitespace from each piece
SQLSplit = sapply(strsplit(txt,split="WHERE|FROM|SELECT"),trimws)
SQLSegmented = unlist(strsplit(SQLSplit, ".|,", fixed = TRUE))
SplitOnPeriod = sapply(strsplit(SQLSegmented[2],split=","),trimws)
SplitOnComma = sapply(strsplit(SplitOnPeriod,split="as"),trimws)
for (i in 1:ncol(SplitOnComma))
{
cat(SplitOnComma[1,i])
cat(SplitOnComma[2,i])
test = gsub(SplitOnComma[2,i], SplitOnComma[1,i], SQLSegmented[1])
}
Answer 0 (score: 1)
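This may simply be a matter of how you are [not] storing the result of each iteration of the for loop. Every pass calls gsub on the unchanged SQLSegmented[1], so test is overwritten each time and only the last replacement survives.
What if you repeatedly update the same string, named test below, so that you end up with one fully updated string?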
test <- SQLSegmented[1]
for (i in 1:ncol(SplitOnComma))
{
cat(SplitOnComma[1,i])
cat(SplitOnComma[2,i])
test = gsub(SplitOnComma[2,i], SplitOnComma[1,i], test)
}
test
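To make the difference visible, here is a minimal sketch that runs both loop patterns side by side. The vectors `aliases` and `tables` and the variable names `overwritten` and `accumulated` are made up for illustration; they stand in for the rows of SplitOnComma so that the snippet runs on its own.

```r
# Alias/table pairs as they would come out of the FROM clause
# (hard-coded here so the snippet is self-contained).
aliases <- c("AL1", "AL2")
tables  <- c("Table_1", "Table_2")
select_clause <- "AL1.attr1,AL2.attr2"

# Pattern from the question: every pass starts again from the
# untouched input, so only the last replacement is kept.
for (i in seq_along(aliases)) {
  overwritten <- gsub(aliases[i], tables[i], select_clause, fixed = TRUE)
}

# Accumulating pattern: every pass works on the previous result.
accumulated <- select_clause
for (i in seq_along(aliases)) {
  accumulated <- gsub(aliases[i], tables[i], accumulated, fixed = TRUE)
}

print(overwritten)  # "AL1.attr1,Table_2.attr2"
print(accumulated)  # "Table_1.attr1,Table_2.attr2"
```

The first result still contains AL1, which matches the symptom described in the question (only the second alias gets replaced).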
Answer 1 (score: 1)
library(stringr)
txt <- "SELECT AL1.attr1,AL2.attr2 FROM Table_1 as AL1, Table_2 as AL2 WHERE AL1.attr1 == 1"
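# Capture each "<table> as <alias>" pair: column 2 is the table, column 3 the alias
matches <- str_match_all(txt, "([A-Za-z0-9_]+) +as +([A-Za-z0-9_]+)")
for (i in seq_len(nrow(matches[[1]]))) {
  # Replace "alias." with "table." (fixed = TRUE, so the dot is literal)
  txt <- gsub(sprintf("%s.", matches[[1]][i,3]),
              sprintf("%s.", matches[[1]][i,2]),
              txt,
              fixed=TRUE)
}
# Remove the now-unneeded " as <alias>" declarations
txt <- gsub(" +as +[A-Za-z_0-9]+", "", txt)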
txt
## [1] "SELECT Table_1.attr1,Table_2.attr2 FROM Table_1, Table_2 WHERE Table_1.attr1 == 1"
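One possible refinement, sketched in base R only: because the approach above replaces `alias.` as a fixed string, an alias that is the tail of another alias (say A1 and BA1) could be replaced inside the longer one. The function below is a hypothetical helper (the name `expand_aliases` is not from any package) that adds a word boundary in front of the alias. It assumes that "as" only appears in the FROM clause; column aliases such as `SELECT x as y` would be misread as table aliases.

```r
# Hypothetical helper: replace table aliases with full table names.
expand_aliases <- function(sql) {
  # Matches "<table> as <alias>" (case-insensitive "as").
  pair_pattern <- "([A-Za-z0-9_]+)\\s+as\\s+([A-Za-z0-9_]+)"
  pairs <- regmatches(
    sql,
    gregexpr(pair_pattern, sql, ignore.case = TRUE, perl = TRUE)
  )[[1]]

  for (pair in pairs) {
    table_name <- sub(pair_pattern, "\\1", pair, ignore.case = TRUE)
    alias      <- sub(pair_pattern, "\\2", pair, ignore.case = TRUE)
    # \\b stops alias "A1" from matching inside "BA1."
    sql <- gsub(paste0("\\b", alias, "\\."), paste0(table_name, "."),
                sql, perl = TRUE)
  }

  # Drop the now-redundant " as <alias>" declarations.
  gsub("\\s+as\\s+[A-Za-z0-9_]+", "", sql, ignore.case = TRUE, perl = TRUE)
}

overlapping <- "SELECT A1.x, BA1.y FROM orders as A1, customers as BA1 WHERE BA1.y = 1"
result <- expand_aliases(overlapping)
print(result)
# "SELECT orders.x, customers.y FROM orders, customers WHERE customers.y = 1"
```

Calling `expand_aliases()` on the txt from the question should give the same string as the output shown above.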