Inventing a text grammar

Time: 2014-12-03 00:13:10

Tags: r

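Thanks in advance for reading this question. I am trying to write a general, multi-purpose function in R that manipulates text written in a specific structure. Let me describe what I want with an example (ReadDB is the function I am trying to write):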

ReadDB <- function(query, ...){
...
}
text = "I'm Mahdi; {[What's your name?] Nice to see you <name>.}"

ReadDB(query = text, name = "Mark")
# output is : I'm Mahdi; Nice to see you Mark.

ReadDB(query = text)
# output is : I'm Mahdi; What's your name?

ReadDB(query = text, Age = 22)
# warning is : Age Argument is not used!
# output is : I'm Mahdi; What's your name?

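Explanation of the example:

  1. Every block in the text is enclosed in {}.
  2. Every argument is marked with <> on both sides.
  3. If the arguments defined in a block are supplied to the function by the user, the phrase between [] is removed from the block and <arg> is replaced by the supplied value. Otherwise, everything in the block is removed and only the phrase between [] remains.
  4. The more complicated case is a block inside another block, where we expect the same precedence ordering that parentheses () give in equations.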
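Rules 1 to 3 for a single, non-nested block can be sketched in base R. This is only an illustration under stated assumptions: `resolve_block`, `body` and `values` are hypothetical names that do not appear in the question, and the warning for unused arguments is left out.

```r
# Hypothetical helper: resolve the text found between { and },
# given a named list of supplied values.
resolve_block <- function(body, values) {
  # Split off the optional [default] phrase
  default <- ""
  m <- regmatches(body, regexpr("\\[[^]]*\\]", body))
  if (length(m) == 1) {
    default <- substr(m, 2, nchar(m) - 1)
    body <- sub("\\[[^]]*\\]", "", body)
  }
  # Names of all <arg> placeholders left in the block
  args <- gsub("[<>]", "", regmatches(body, gregexpr("<[^>]*>", body))[[1]])
  # Rule 3: if any placeholder has no value, only the default survives
  if (!all(args %in% names(values))) return(default)
  for (a in args) {
    body <- gsub(paste0("<", a, ">"), values[[a]], body, fixed = TRUE)
  }
  body
}

block <- "[What's your name?] Nice to see you <name>."
resolve_block(block, list(name = "Mark"))  # " Nice to see you Mark."
resolve_block(block, list())               # "What's your name?"
resolve_block("I live in <city>.", list()) # ""
```

The leading space in the first result is what remains after the bracketed phrase is removed; a caller could collapse it afterwards.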

Update

I have added a more complicated case using nested {}, as follows:

text = "I'm Mahdi; {[What's your name?] Nice to see you <name>.{I live in <city>.}}"

ReadDB(query = text, name = "Mark")
# output is : I'm Mahdi; Nice to see you Mark.

ReadDB(query = text)
# output is : I'm Mahdi; What's your name?

ReadDB(query = text, city = "St. Louis", name = "Mark")
# output is : I'm Mahdi; Nice to see you Mark. I live in St. Louis.

ReadDB(query = text, city = "St. Louis")
# output is : I'm Mahdi; What's your name?

Note that if no default value is given inside [], the default is empty. Hence text = {I live in <city>.} is the same as text = {[]I live in <city>.}.
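One way to get the parenthesis-like ordering for nested blocks is to always work on a block that contains no other block. The base R sketch below only locates such a block; it is an assumption about how the ordering could be implemented, not code from the question.

```r
text <- "I'm Mahdi; {[What's your name?] Nice to see you <name>.{I live in <city>.}}"

# A block whose body holds no { or } is an innermost block
innermost <- regmatches(text, regexpr("\\{[^{}]*\\}", text))
innermost
# [1] "{I live in <city>.}"

# An empty default can be dropped without changing the block
sub("[]", "", "{[]I live in <city>.}", fixed = TRUE)
# [1] "{I live in <city>.}"
```

Resolving that block, splicing the result back into the text, and searching again would process the blocks from the inside out.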

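2 answers:

Answer 0 (score: 3)

The description is not clear to me in the absence of more illustrative examples, but this works for the examples shown. It extracts the default string into default and then removes { and }, as well as everything between [ and ]. Then it extracts the names in the query and determines which argument names are not used, issuing a warning for each of those. Finally it determines which names in the query were not substituted: if there are any, it returns the query with { and everything after it replaced by default; otherwise it returns the query with the names substituted.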

library(gsubfn)

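ReadDB <- function(query, ...) {

    L <- list(...)
    # Default text is whatever sits between [ and ]
    default <- strapplyc(query, "\\[(.*)\\]", simplify = TRUE)

    # Strip the braces, then the bracketed default
    query2 <- gsub("[{}]", "", query)
    query3 <- gsub("\\[[^]]*\\]", "", query2)

    # Plain < and > are literal characters; "\\<" and "\\>" would be
    # word boundaries in R's default regex engine
    pat <- "<([^>]*)>"
    names_in_query <- strapplyc(query3, pat)[[1]]

    # Warn about supplied arguments that the query never mentions
    args_not_used <- setdiff(names(L), names_in_query)
    for(nm in args_not_used) warning(nm, " not used\n")

    # Fall back to the default if any name in the query has no value
    names_not_substituted <- setdiff(names_in_query, names(L))
    if (length(names_not_substituted)) sub("\\{.*", default, query)
    else gsubfn(pattern = pat, L, x = query3)
}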

which gives:

> ReadDB(text)
[1] "I'm Mahdi; What's your name?"
> ReadDB(query = text, name = "Mark")
[1] "I'm Mahdi;  Nice to see you Mark."
> ReadDB(query = text, Age = 22)
[1] "I'm Mahdi; What's your name?"
Warning message:
In ReadDB(query = text, Age = 22) : Age not used
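The two setdiff() steps that drive this output can be reproduced without gsubfn. The following base R sketch assumes the same query as in the question; `supplied` is a hypothetical stand-in for the arguments passed through `...`.

```r
query <- "I'm Mahdi; {[What's your name?] Nice to see you <name>.}"
supplied <- list(Age = 22)  # hypothetical stand-in for list(...)

# Every <name> placeholder in the query, with the angle brackets removed
names_in_query <- gsub("[<>]", "", regmatches(query, gregexpr("<[^>]*>", query))[[1]])

# Supplied but never mentioned in the query: these trigger the warning
args_not_used <- setdiff(names(supplied), names_in_query)
# Mentioned in the query but not supplied: these force the default text
names_not_substituted <- setdiff(names_in_query, names(supplied))

args_not_used          # "Age"
names_not_substituted  # "name"
```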

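The purpose of SO is not to write code for posters. It is to answer programming questions, so next time please provide your own code; if that code is too long, the question is not suitable and needs to be narrowed down.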

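Answer 1 (score: 0)

First, let me thank Grothendieck for his clever answer. However, since his answer still cannot handle nested {}, I decided to post my own implementation for this problem. Hopefully others can use it too: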

ReadDB <- function(query, ...) {
  arg = list(...)
  query.string = query

  # Replace each supplied argument in the text; warn when its <name> is absent
  for (query.arg in names(arg)) {
    query.arg_ = paste("<", query.arg, ">", sep = "")
    if (grepl(query.arg_, query.string, ignore.case = TRUE)) {
      # ignore.case here keeps gsub() consistent with the grepl() test above
      query.string = gsub(query.arg_, arg[[query.arg]], query.string, ignore.case = TRUE)
    } else {
      warning(paste(query.arg, " argument is not filtered in query!", sep = ""))
    }
  }

  # Find the positions of all delimiters and pair each closing delimiter
  # with the nearest unmatched opening delimiter before it
  find_period = function(x) {
    ch = c("\\{", "\\}", "\\[", "\\]", "<", ">")
    A = lapply(ch, function(ch) {
      unlist(ifelse(grepl(ch, x, perl = FALSE), gregexpr(ch, x, perl = FALSE), NA))
    })
    ind = 2
    while (ind <= length(A)) {
      tmp = NULL
      for (xind in A[[ind]])
        tmp = c(tmp, max(setdiff(A[[ind - 1]][A[[ind - 1]] < xind], tmp)))
      A[[ind - 1]] = tmp
      ind = ind + 2
    }
    names(A) <- ch
    return(A)
  }
  p = find_period(query.string)

  # Resolve one block per pass, innermost first, until no {} pair is left
  while (!is.na(p[[1]][1] + p[[2]][1])) {
    Block.text = substr(x = query.string, p[[1]][1] + 1, p[[2]][1] - 1)
    p2 = find_period(Block.text)
    # [1] keeps the conditions scalar when a block holds several placeholders
    if (!is.na(p2[[5]][1])) {
      # An unfilled <arg> remains: keep only the [] default, or nothing
      Block.text = ifelse(is.na(p2[[3]][1]), "", substr(Block.text, p2[[3]][1] + 1, p2[[4]][1] - 1))
    } else {
      # Every argument was filled: drop the [] default
      Block.text = gsub(pattern = "\\[.*\\]", replacement = "", x = Block.text)
    }
    # Splice the resolved block back between the text before and after it
    query.string = paste(ifelse(p[[1]][1] == 1, "", substr(x = query.string, 1, p[[1]][1] - 1)),
                         Block.text,
                         ifelse(p[[2]][1] == nchar(query.string), "", substr(x = query.string, p[[2]][1] + 1, nchar(query.string))),
                         sep = "")
    p = find_period(query.string)
  }

  query.string = gsub(pattern = " {2,}", replacement = " ", x = query.string) # collapse repeated spaces
  return(query.string)
}

Here is a test:

> text = "I'm Mahdi; {[What's your name?] Nice to see you <name>.{I live in <city>.}}"
> ReadDB(query = text, city = "St. Louis", name="Mike")
[1] "I'm Mahdi; Nice to see you Mike.I live in St. Louis."

> ReadDB(query = text, city = "St. Louis")
[1] "I'm Mahdi; What's your name?"

> ReadDB(query = text, name="Mike")
[1] "I'm Mahdi; Nice to see you Mike."

> ReadDB(query = text, name="Mahdi", Age = 22)
[1] "I'm Mahdi; Nice to see you Mahdi."
Warning message:
In ReadDB(query = text, name = "Mahdi", Age = 22) :
  Age argument is not filtered in query!