Question

I need to replace some of the values in a character vector:

x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
> x
   Strings
1      one
2      two
3    three
4     four
5     five
6     four
7     five
8     four
9     five
10     two
11    thre
12     two
13   three
14     two
15   three

In Python, I would do this:

x["Strings"].replace(["one", "two", "thre","three"], ["One","Two","Three","Three"], inplace=True)

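In R, however, the replace() function does not work this easily. There are many solutions on Stack Overflow for replacing strings, but none of them is this simple. Is it possible in R?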

Answer 1

If you only want to capitalize the first letter of each word, we can use sub:

x$new <- sub('^([a-z])', '\\U\\1', x$Strings, perl = TRUE)

Output:

   Strings   new
1      one   One
2      two   Two
3    three Three
4     four  Four
5     five  Five
6     four  Four
7     five  Five
8     four  Four
9     five  Five
10     two   Two
11    thre  Thre
12     two   Two
13   three Three
14     two   Two
15   three Three

If you already have a list of old words and their replacements, we can use str_replace_all, which is similar in style to the OP's Python example:

library(stringr)

pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")

named_vec <- setNames(replacements, paste0("\\b", pattern, "\\b"))

x$new <- str_replace_all(x$Strings, named_vec)
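
The \\b word boundaries matter because str_replace_all applies a named vector of patterns one after another. A minimal sketch of the difference, using a shortened two-element input that is an assumption for illustration only:

```r
library(stringr)

s <- c("thre", "three")

# Without word boundaries, the "thre" pattern also matches inside "three"
loose  <- str_replace_all(s, c("thre" = "Three", "three" = "Three"))

# With word boundaries, each pattern only matches a whole word
strict <- str_replace_all(s, c("\\bthre\\b" = "Three", "\\bthree\\b" = "Three"))

print(loose)   # "Three" "Threee"
print(strict)  # "Three" "Three"
```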

Or use match or hashmap:

library(dplyr)

# Fall back to the original string where there is no match
x$new <- coalesce(replacements[match(x$Strings, pattern)], x$Strings)


library(hashmap)

hash_lookup <- hashmap(pattern, replacements)
x$new <- coalesce(hash_lookup[[x$Strings]], x$Strings)

Output:

   Strings   new
1      one   One
2      two   Two
3    three Three
4     four  four
5     five  five
6     four  four
7     five  five
8     four  four
9     five  five
10     two   Two
11    thre Three
12     two   Two
13   three Three
14     two   Two
15   three Three
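
The same match-based lookup can be written in base R alone. This is a minimal sketch rather than part of the original answer; the shortened data frame and the name idx are assumptions for illustration:

```r
x <- data.frame(Strings = c("one", "two", "thre", "three", "four"),
                stringsAsFactors = FALSE)
pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")

# match() returns the position of each string in pattern, or NA if absent
idx <- match(x$Strings, pattern)

# Keep the original value wherever there is no replacement
x$new <- ifelse(is.na(idx), x$Strings, replacements[idx])

print(x$new)  # "One" "Two" "Three" "Three" "four"
```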

Answer 2

If capitalization is what you want, capitalize() from the Hmisc package will work. My apologies if I have misunderstood the question.

library(Hmisc)

x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)

x<-sub("thre[^[:space:]]*", "Three", x$Strings)

xCap<-capitalize(x)

as.data.frame(xCap)
    xCap
1    One
2    Two
3  Three
4   Four
5   Five
6   Four
7   Five
8   Four
9   Five
10   Two
11 Three
12   Two
13 Three
14   Two
15 Three

Thanks to @RuiBarradas for the sub fix in the comments.

Answer 3

Syntax close to your Python code (using the plyr package):

x$Strings <- plyr::mapvalues(x$Strings, 
                c("one", "two", "thre","three"),
                c("One","Two","Three","Three")
)
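
A quick check of how mapvalues treats values that are not in the lookup, as a sketch on a shortened vector (the input s is an assumption for illustration). warn_missing = FALSE only silences the message about lookup entries that do not occur in the input:

```r
library(plyr)

s <- c("one", "thre", "four")

out <- mapvalues(s,
                 from = c("one", "two", "thre", "three"),
                 to   = c("One", "Two", "Three", "Three"),
                 warn_missing = FALSE)

# "four" has no entry in `from`, so it is returned unchanged
print(out)  # "One" "Three" "four"
```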

Answer 4

One approach is to convert them to a factor and then replace the levels:

> x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
> x$Strings <- as.factor(x$Strings)
> levels(x$Strings) <- c("Five", "Four", "One", "Three", "Three", "Two")
> x
   Strings
1      One
2      Two
3    Three
4     Four
5     Five
6     Four
7     Five
8     Four
9     Five
10     Two
11   Three
12     Two
13   Three
14     Two
15   Three

Answer 5

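Here is an option using recode. Create a list of key/value pairs, then use recode to match the values in 'Strings' against the keys of the list and replace them with the corresponding values: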

library(tidyverse)
lst1 <- list(one = "One", two = "Two", three = "Three", four = "Four", five = "Five")
x %>% 
   mutate(Strings  = recode(Strings, !!! lst1))

Note: this assumes the camel case is accidental.
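
recode leaves any value without a matching key unchanged, so the misspelled "thre" from the question is only fixed if it gets its own key. A small sketch on a shortened vector, where the list lst2 and the input s are assumptions for illustration:

```r
library(dplyr)

s <- c("one", "thre", "three", "four")

# Same idea as the key/value list above, with an extra key for the typo
lst2 <- list(one = "One", thre = "Three", three = "Three")

out <- recode(s, !!!lst2)

# "four" has no key in lst2 and is passed through as-is
print(out)  # "One" "Three" "Three" "four"
```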

Answer 6

library(dplyr)

x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
y <- c("one", "two", "thre", "three")
z <- c("One", "Two", "Three", "Three")

# Look up each string in y; keep the original where there is no match
x <- x %>%
  mutate(Strings = if_else(!is.na(match(Strings, y)),
                           z[match(Strings, y)],
                           false = Strings))

With dplyr, you only need to change y and z.

R equivalent of string.replace() in Python