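I'm a stubborn useR who uses = instead of <- all the time, and apparently many R programmers frown on this. I wrote the formatR package, which can replace = with <- based on the parser package. As some of you might know, parser was orphaned on CRAN a few days ago. Although it is back now, this made me hesitant to depend on it. I'm wondering if there is another way to safely replace = with <-, because not all = signs mean assignment, e.g. fun(a = 1). Regular expressions are unlikely to be reliable (see line 18 of the mask.inline() function in formatR), but I would certainly appreciate it if you can improve mine. Perhaps the codetools package can help?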
Some test cases:
# should replace
a = matrix(1, 1)
a = matrix(
1, 1)
(a = 1)
a =
1
function() {
a = 1
}
# should not replace
c(
a = 1
)
c(
a = c(
1, 2))
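Answer 0 (score: 4)
This answer uses regular expressions. There are a few edge cases where it will fail, but it should be fine for most code. If you need perfect matching you will need to use a parser, but the regular expressions can always be tweaked if you run into problems.
Watch out for: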
#quoted function names
`my cr*azily*named^function!`(x = 1:10)
#Nested brackets inside functions
mean(x = (3 + 1:10))
#assignments inside if or for blocks
if((x = 10) > 3) cat("foo")
#functions running over multiple lines will currently fail
#maybe fixable with paste(original_code, collapse = "\n")
mean(
x = 1:10
)
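The code is based on the example on the ?regmatches page. The basic idea is: swap the function contents for a placeholder, do the replacement, then put the function contents back.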
#Sample code. For real case, use
#readLines("source_file.R")
original_code <- c("a = 1", "b = mean(x = 1)")
#Function contents are considered to be a function name,
#an open bracket, some stuff, then a close bracket.
#Here function names are considered to be a letter or
#dot or underscore followed by optional letters, numbers, dots or
#underscores. This matches a few non-valid names (see ?make.names
#and warning above).
function_content <- gregexpr(
  "[[:alpha:]._][[:alnum:]._]*\\([^)]*\\)",
  original_code
)
#Take a copy of the code to modify
copy <- original_code
#Replace all instances of function contents with the word PLACEHOLDER.
#If you have that word inside your code already, things will break.
copy <- mapply(
  function(pattern, replacement, x)
  {
    if(length(pattern) > 0)
    {
      gsub(pattern, replacement, x, fixed = TRUE)
    } else x
  },
  pattern = regmatches(copy, function_content),
  replacement = "PLACEHOLDER",
  x = copy,
  USE.NAMES = FALSE
)
#Replace = with <-
copy <- gsub("=", "<-", copy)
#Now substitute back your function contents
(fixed_code <- mapply(
  function(pattern, replacement, x)
  {
    if(length(replacement) > 0)
    {
      gsub(pattern, replacement, x, fixed = TRUE)
    } else x
  },
  pattern = "PLACEHOLDER",
  replacement = regmatches(original_code, function_content),
  x = copy,
  USE.NAMES = FALSE
))
#Write back to your source file
#writeLines(fixed_code, "source_file_fixed.R")
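A more compact variant of the same idea, offered only as a sketch and not as the answerer's code: instead of swapping in a placeholder, it rewrites = only in the parts of each line that fall outside the matched function calls, using the invert argument of regmatches(). The variable names (call_pattern, outside) and the two-call sample line are assumptions for illustration. It shares the limitations listed above (nested brackets, multi-line calls).

```r
# Sample input; the second line has two function calls on purpose.
original_code <- c("a = 1", "b = mean(x = 1) + max(y = 2)")

# Same "name(...)" pattern as above: a function name, an open bracket,
# anything that is not a close bracket, then a close bracket.
call_pattern <- "[[:alpha:]._][[:alnum:]._]*\\([^)]*\\)"
m <- gregexpr(call_pattern, original_code)

# Pull out the text OUTSIDE the matched calls, rewrite = there only,
# and write those pieces back in place.
fixed_code <- original_code
outside <- regmatches(fixed_code, m, invert = TRUE)
regmatches(fixed_code, m, invert = TRUE) <-
  lapply(outside, gsub, pattern = "=", replacement = "<-", fixed = TRUE)

print(fixed_code)
```

Because nothing is swapped out and back in, a line containing several function calls keeps each call's own contents, and there is no reserved word such as PLACEHOLDER that could clash with the code.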
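Answer 1 (score: 4)
Kohske sent a pull request to the formatR package that solved the problem using the codetools package. The basic idea is to set up a code walker to walk through the code; when it detects = as the symbol of a function call, it replaces it with <-. This works because of the "Lisp nature" of R: x = 1 is actually `=`(x, 1) (which we replace with `<-`(x, 1)); of course, = is treated differently in the parse tree of fun(x = 1), where it only marks a named argument.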
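To make the parse-tree point concrete, here is a minimal base R sketch of such a walk. It is not the code from the pull request and does not use codetools; replace_assign and rewrite are hypothetical helper names introduced for illustration. Note that deparse() discards comments and the original layout, so this only demonstrates the idea.

```r
# Hypothetical helper (not formatR's code): rewrite `=` calls as `<-` calls.
replace_assign <- function(e) {
  if (!is.call(e)) return(e)
  # An assignment x = 1 is a call whose function symbol is `=`.
  if (identical(e[[1]], as.name("="))) e[[1]] <- as.name("<-")
  # Recurse only into sub-calls. Named arguments are stored as names on
  # the call, not as `=` calls, so fun(x = 1) is left untouched.
  for (i in seq_along(e)) {
    if (is.call(e[[i]])) e[[i]] <- replace_assign(e[[i]])
  }
  e
}

# Hypothetical wrapper: parse one expression, rewrite it, turn it back to text.
rewrite <- function(txt) deparse(replace_assign(parse(text = txt)[[1]]))

as.list(parse(text = "x = 1")[[1]])  # the call's pieces: `=`, x, 1
rewrite("x = 1")                     # "x <- 1"
rewrite("(a = 1)")                   # "(a <- 1)"
rewrite("c(a = c(1, 2))")            # "c(a = c(1, 2))", unchanged
```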
The formatR package (>= 0.5.2) has since got rid of its dependency on the parser package, and replace.assign should be robust now.
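For readers who want to keep comments and layout intact, a token-based approach is another option. The following is a sketch under stated assumptions, not a description of how formatR implements replace.assign: it relies on utils::getParseData(), which labels an assignment = as EQ_ASSIGN and a named-argument = as EQ_SUB, and it assumes plain ASCII source without tabs, one line per element of the input vector.

```r
# Sample input covering "should replace" and "should not replace" cases.
code <- c("a = matrix(1, 1)", "(b = 1)", "c(", "  a = 1", ")")

# Parse with source references kept so that token positions are available.
pd <- utils::getParseData(parse(text = code, keep.source = TRUE))

# Keep only the = tokens that the parser classified as assignments.
eq <- pd[pd$token == "EQ_ASSIGN", ]

# Work from the last position to the first, so that inserting the extra
# character of "<-" does not shift columns that are still to be processed.
eq <- eq[order(eq$line1, eq$col1, decreasing = TRUE), ]

fixed <- code
for (i in seq_len(nrow(eq))) {
  l <- eq$line1[i]
  k <- eq$col1[i]
  fixed[l] <- paste0(substr(fixed[l], 1, k - 1), "<-", substring(fixed[l], k + 1))
}

print(fixed)
```

Only the assignment tokens are touched, so the a = 1 inside c( ... ) is left alone and the rest of each line is preserved character for character.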
Answer 2 (score: -3)
The safest (and probably fastest) way to replace = with <- is to type <- directly, rather than trying to replace it afterwards.