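I am writing a function in R to disambiguate a vector of city names. The basic idea is that the function returns the original values whenever they match a lookup table, and otherwise tries to clean the data in various ways (e.g. fuzzy matching, removing punctuation).
I have tried to summarize the logic in this example: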
x <- "sun fish"
s <- function(x) {
  if (x == 'animal') { # condition A
    return(paste(x, "is an animal"))
  } else if (x == 'fish') { # condition B
    return(paste(x, "is a fish"))
  } else { # condition C (does some cleaning)
    x <- sapply(strsplit(x, " "), '[[', 2)
    return(paste(x, "is something else"))
  }
}
s(x)
If condition C is reached, what is the best way to pass x through conditions A and B again?
Answer 0 (score: 2)
You can use recursion to apply the tests again:
x <- "sun fish"
s <- function(x) {
  if (x == 'animal') { # condition A
    return(paste(x, "is an animal"))
  } else if (x == 'fish') { # condition B
    return(paste(x, "is a fish"))
  } else { # condition C (does some cleaning)
    words <- strsplit(x, " ")[[1]]
    # Recurse only when there is a second word; indexing a one-word input with [[2]] would error
    if (length(words) > 1) return(s(words[2]))
    return(paste(x, "is something else"))
  }
}
s(x)
[1] "fish is a fish"
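Closer to the lookup-table use case in the question, the same recursive idea can be sketched as "test, clean with the next step, test again". This is only an illustration under assumptions: `known_cities`, `cleaners` and `match_city` are hypothetical names, the two cleaning steps are placeholders for whatever cleaning the real data needs, and fuzzy matching is left out.

```r
# Hypothetical lookup table of canonical city names.
known_cities <- c("new york", "boston", "st louis")

# Hypothetical cleaning steps, applied cumulatively in this order.
cleaners <- list(
  lower    = function(x) tolower(trimws(x)),
  no_punct = function(x) gsub("[[:punct:]]", "", x)
)

match_city <- function(x, lookup = known_cities, steps = cleaners) {
  if (x %in% lookup) return(x)                   # already matches: return it unchanged
  if (length(steps) == 0) return(NA_character_)  # no cleaning step left to try
  # Apply the next cleaning step, then run the same test on the result.
  match_city(steps[[1]](x), lookup, steps[-1])
}

match_city("boston")       # "boston"
match_city(" St. Louis ")  # "st louis"
match_city("Gotham")       # NA
```

Each call either returns a match or drops one cleaning step, so the recursion always ends after at most `length(cleaners)` extra calls.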
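The code above does not work for vectors, because if/else tests only a single value. This version should fix that while keeping the animal's full name: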
x <- c("animal", "sun fish", "an other bug")
s <- function(x) {
  ifelse(x == 'animal',
    paste(x, "is an animal"),
    ifelse(x == 'fish',
      paste(x, "is a fish"),
      ifelse(lengths(strsplit(x, " ")) > 1,
        paste(sub("([a-z]*) .*", "\\1", x),
              s(sub("[a-z]* (.+)", "\\1", x))),
        paste(x, "is something else"))))
}
s(x)
[1] "animal is an animal" "sun fish is a fish" "an other bug is something else"
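If the nested ifelse() calls are hard to follow, an alternative sketch is to keep a plain scalar function and apply it element by element with vapply(). The names `s_one` and `s_vec` are hypothetical and not part of the answer above; the logic is meant to mirror it (keep the first word, re-test the rest).

```r
# Scalar version: handles exactly one string.
s_one <- function(x) {
  if (x == "animal") return(paste(x, "is an animal"))  # condition A
  if (x == "fish") return(paste(x, "is a fish"))       # condition B
  words <- strsplit(x, " ")[[1]]
  if (length(words) > 1) {
    # Condition C: keep the first word and re-test the remainder.
    rest <- paste(words[-1], collapse = " ")
    return(paste(words[1], s_one(rest)))
  }
  paste(x, "is something else")
}

# Vector wrapper: vapply() guarantees one character value per input element.
s_vec <- function(x) vapply(x, s_one, character(1), USE.NAMES = FALSE)

s_vec(c("animal", "sun fish", "an other bug"))
# "animal is an animal" "sun fish is a fish" "an other bug is something else"
```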
Answer 1 (score: 1)
Try using switch() instead of multiple if() calls:
x <- "sun fish"
s <- function(x) {
  z <- switch(x,
    animal = "is an animal",
    fish = "is a fish",
    "is something else"
  )
  paste(x, z)
}
Result:
s(x)
[1] "sun fish is something else"
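As the result shows, switch() on its own sends "sun fish" straight to the default branch without cleaning it and re-testing. One possible way to combine it with the re-test asked about in the question is to do the cleaning inside the default branch and call the function again. This is a minimal sketch, not the answerer's code: `s_switch` is a hypothetical name, and treating the last word as the cleaned value is an assumption made for illustration.

```r
s_switch <- function(x) {
  words <- strsplit(x, " ")[[1]]
  switch(x,
    animal = paste(x, "is an animal"),
    fish   = paste(x, "is a fish"),
    # Default branch: clean (assumed here: keep the last word) and re-test,
    # or give up when there is nothing left to strip.
    if (length(words) > 1) {
      s_switch(words[length(words)])
    } else {
      paste(x, "is something else")
    }
  )
}

s_switch("sun fish")  # "fish is a fish"
s_switch("bug")       # "bug is something else"
```

Like the if/else version, this works on one string at a time, since switch() expects a length-one value.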