Replacing a string multiple times

Time: 2014-04-20 11:48:29

Tags: r loops gsub

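I'm new to R, working with RStudio and enjoying it. I'm basically trying to break down a SQL statement so that every alias is replaced with its full table name. I've had a lot of help from this forum with the code below.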

The code below takes a SQL statement and splits it into its basic components: SELECT, FROM and WHERE.

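I'm using this SQL as an example, but the statement could contain any number of tables and aliases. The aim is to write a loop that replaces the aliases with the full table names, whatever the SQL and however many aliases it has. At the moment my gsub call only replaces the second alias in the query. Can anyone see the error in my logic?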

txt <- "SELECT AL1.attr1,AL2.attr2 FROM Table_1 as AL1, Table_2 as AL2 WHERE AL1.attr1 == 1"

###########################################################################################
# First split the SQL statement into its SELECT, FROM and WHERE clauses (one element for each)
# Take the FROM clause and split it on commas, so each piece looks like "Table_1 as AL1"
# Then split each piece on 'as', separating the alias from the actual table name
###########################################################################################

Reference

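# trim() is not a base R function; base R's trimws() is used here to strip whitespace
SQLSplit = trimws(strsplit(txt, split="WHERE|FROM|SELECT")[[1]])
SQLSegmented = SQLSplit[SQLSplit != ""]   # drop the empty piece that precedes SELECT
SplitOnPeriod = trimws(strsplit(SQLSegmented[2], split=",")[[1]])
SplitOnComma = sapply(strsplit(SplitOnPeriod, split=" as "), trimws)   # row 1 = table, row 2 = alias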

for (i in 1:ncol(SplitOnComma)) 
{
    cat(SplitOnComma[1,i])
    cat(SplitOnComma[2,i])
    test = gsub(SplitOnComma[2,i], SplitOnComma[1,i], SQLSegmented[1])
}

2 Answers:

Answer 0 (score: 1)

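This may just be a matter of how you are [not] storing the result of each iteration of the for loop. Every pass calls gsub on the untouched SQLSegmented[1] and overwrites test, so only the last replacement survives. What if you repeatedly update the same string, called test below, so that you end up with one fully updated string?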

test <- SQLSegmented[1]
for (i in 1:ncol(SplitOnComma)) 
{
  cat(SplitOnComma[1,i])
  cat(SplitOnComma[2,i])
  test = gsub(SplitOnComma[2,i], SplitOnComma[1,i], test)
}
test
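
To make the difference concrete, here is a minimal self-contained sketch in base R that runs both loop styles side by side. The `aliases` lookup vector and the names `overwritten` and `accumulated` are made up for this illustration and are not part of the original code; the alias/table pairs are written out by hand rather than parsed from the SQL.

```r
txt <- "SELECT AL1.attr1,AL2.attr2 FROM Table_1 as AL1, Table_2 as AL2 WHERE AL1.attr1 == 1"

# Hypothetical lookup (alias -> table name), written by hand for this example
aliases <- c(AL1 = "Table_1", AL2 = "Table_2")

overwritten <- NULL   # mimics the loop in the question
accumulated <- txt    # mimics the corrected loop

for (alias in names(aliases)) {
  pattern     <- paste0(alias, ".")
  replacement <- paste0(aliases[[alias]], ".")
  # Starts again from txt on every pass, so earlier replacements are lost
  overwritten <- gsub(pattern, replacement, txt, fixed = TRUE)
  # Feeds the previous result back in, so the replacements add up
  accumulated <- gsub(pattern, replacement, accumulated, fixed = TRUE)
}

overwritten   # still contains "AL1.attr1"
accumulated   # both aliases replaced in the column references
```

After the loop, `overwritten` only has `AL2.` replaced, which matches the symptom described in the question, while `accumulated` has both `AL1.` and `AL2.` replaced.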

Answer 1 (score: 1)

library(stringr)

txt <- "SELECT AL1.attr1,AL2.attr2 FROM Table_1 as AL1, Table_2 as AL2 WHERE AL1.attr1 == 1"

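# Capture every "table as alias" pair; group 1 is the table, group 2 the alias
matches <- str_match_all(txt, "([A-Za-z0-9_]+) +as +([A-Za-z0-9_]+)")

# matches[[1]] has one row per pair: column 2 holds the table, column 3 the alias
for (i in seq_len(nrow(matches[[1]]))) {

  # Replace "alias." with "table."; fixed=TRUE makes the dot a literal character
  txt <- gsub(sprintf("%s.", matches[[1]][i,3]), 
              sprintf("%s.", matches[[1]][i,2]),
              txt,
              fixed=TRUE)

}

# Remove the " as alias" declarations from the FROM clause
txt <- gsub(" +as +[A-Za-z_0-9]+", "", txt)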
txt
## [1] "SELECT Table_1.attr1,Table_2.attr2 FROM Table_1, Table_2 WHERE Table_1.attr1 == 1"
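
For anyone who would rather avoid the stringr dependency, the same idea can be sketched in base R and wrapped in a function. This is only a variant under stated assumptions, not the answerer's code: `expand_aliases` is a hypothetical name, aliases are assumed to be declared with a lowercase `as` surrounded by spaces, and a word boundary is added so that an alias such as `AL1` is not replaced inside a longer name such as `XAL1`.

```r
# Hypothetical helper: replace every "alias." with "table." and drop the "as alias" parts
expand_aliases <- function(sql) {
  # Find every "<table> as <alias>" declaration in the statement
  pairs <- regmatches(sql, gregexpr("[A-Za-z0-9_]+ +as +[A-Za-z0-9_]+", sql))[[1]]

  for (p in pairs) {
    parts <- strsplit(p, " +as +")[[1]]   # parts[1] = table, parts[2] = alias
    # \\b stops the alias from matching inside a longer identifier; \\. requires a literal dot
    sql <- gsub(paste0("\\b", parts[2], "\\."), paste0(parts[1], "."), sql, perl = TRUE)
  }

  # Remove the now redundant " as <alias>" declarations
  gsub(" +as +[A-Za-z0-9_]+", "", sql)
}

sql_in <- "SELECT AL1.attr1,AL2.attr2 FROM Table_1 as AL1, Table_2 as AL2 WHERE AL1.attr1 == 1"
sql_out <- expand_aliases(sql_in)
sql_out
```

Because the function reassigns `sql` on every pass, it accumulates the replacements in the same way as the loop in Answer 0.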