Find and replace using a hash of key-value pairs in R

Time: 2018-03-19 10:02:50

Tags: r

My function is as follows:

library(stringr)

HistolMacDescrip <- function(dataframe, MacroColumn) {
  dataframe <- data.frame(dataframe)

  # Column specific cleanup
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Dd]ictated by.*", "")
  # Conversion of text numbers to allow number of biopsies to be extracted
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Oo]ne", "1")
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Ss]ingle", "1")
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Tt]wo", "2")
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Tt]hree", "3")
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Ff]our", "4")
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Ff]ive", "5")
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Ss]ix", "6")
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Ss]even", "7")
  dataframe[, MacroColumn] <- str_replace(dataframe[, MacroColumn],
                                          "[Ee]ight", "8")
  return(dataframe)
}

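I think the amount of repetition is a bit ridiculous and the code could be tidier. I would like to know whether the find-and-replace lines can be driven by a key-value lookup instead.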

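One complication is that the input may contain more than one number written as a word, so the find and replace must not stop once the first match is found.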

Example input

d<-c("There are two specimens","Three exist here","Two three four")
d<-data.frame(d)

Example output

"There are 2 specimens",
"3 exist here",
"2 3 4"

1 answer:

Answer 0: (score: 0)

We can use the english package together with gsubfn:
library(english)
library(gsubfn)
sub("^([a-z])", "\\U\\1",  gsubfn("\\w+", setNames(as.list(2:4), 
   as.character(english(2:4))), tolower(as.character(d$d))), perl = TRUE)
#[1] "There are 2 specimens" "3 exist here"          "2 3 4"
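
A sketch closer to the key-value lookup the question asks about, assuming stringr is acceptable (the question's function already calls str_replace): str_replace_all() accepts a named character vector in which each name is a pattern and each value its replacement, and it replaces every match rather than only the first. The names number_words, lookup and replace_number_words are made up for illustration. The \\b word boundaries and the (?i) case-insensitive flag are additions that are not in the original function; they are there so that "one" inside a word such as "stone" is left alone.

```r
library(stringr)

# Key-value lookup: names are regex patterns, values are replacements.
# (?i) matches case-insensitively; \\b restricts matches to whole words.
number_words <- c("one", "single", "two", "three", "four",
                  "five", "six", "seven", "eight")
digits <- c("1", "1", "2", "3", "4", "5", "6", "7", "8")
lookup <- setNames(digits, paste0("(?i)\\b", number_words, "\\b"))

# Hypothetical helper: replace every number word found in x.
replace_number_words <- function(x) {
  str_replace_all(as.character(x), lookup)
}

d <- data.frame(d = c("There are two specimens", "Three exist here",
                      "Two three four"),
                stringsAsFactors = FALSE)
result <- replace_number_words(d$d)
print(result)
#[1] "There are 2 specimens" "3 exist here"          "2 3 4"
```

Inside the original function this would reduce the nine str_replace calls to a single line such as dataframe[, MacroColumn] <- replace_number_words(dataframe[, MacroColumn]), and extending the range only means adding entries to the two vectors.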