R equivalent of Python's string.replace()

Time: 2019-02-20 15:34:16

Tags: r string data-manipulation

I need to replace certain values in a character vector:

x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
> x
   Strings
1      one
2      two
3    three
4     four
5     five
6     four
7     five
8     four
9     five
10     two
11    thre
12     two
13   three
14     two
15   three

In Python, I would do this:

x["Strings"].replace(["one", "two", "thre","three"], ["One","Two","Three","Three"], inplace=True)

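In R, however, the replace() function does not handle this so easily. There are many solutions on Stack Overflow for replacing strings, but none is as simple as this. Is something like it possible in R?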

6 answers:

Answer 0: (score: 3)

If you only want to capitalize the first letter of each word, we can use sub:

x$new <- sub('^([a-z])', '\\U\\1', x$Strings, perl = TRUE)

Output:

   Strings   new
1      one   One
2      two   Two
3    three Three
4     four  Four
5     five  Five
6     four  Four
7     five  Five
8     four  Four
9     five  Five
10     two   Two
11    thre  Thre
12     two   Two
13   three Three
14     two   Two
15   three Three
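As a small self-contained check of the sub() call above, here is a minimal sketch on a made-up vector (the name words is only for illustration). It also compares the result with a base R alternative built from toupper() and substring(). Note that capitalizing alone turns "thre" into "Thre", not "Three".

```r
# Illustrative input; not the OP's data frame
words <- c("one", "thre", "Four")

# \\U upper-cases the captured first letter; this requires perl = TRUE
capitalized <- sub("^([a-z])", "\\U\\1", words, perl = TRUE)

# Base R alternative without regular expressions
capitalized_alt <- paste0(toupper(substring(words, 1, 1)), substring(words, 2))

print(capitalized)
print(identical(capitalized, capitalized_alt))
```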

If you already have a list of old words and their replacements, we can use str_replace_all, which is similar in style to the OP's Python example:

library(stringr)

pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")

named_vec <- setNames(replacements, paste0("\\b", pattern, "\\b"))

x$new <- str_replace_all(x$Strings, named_vec)

Or use match or hashmap:

library(dplyr)

# Fall back to the original string where there is no match
x$new <- coalesce(replacements[match(x$Strings, pattern)], x$Strings)

library(hashmap)

hash_lookup <- hashmap(pattern, replacements)
x$new <- coalesce(hash_lookup[[x$Strings]], x$Strings)

Output:

   Strings   new
1      one   One
2      two   Two
3    three Three
4     four  four
5     five  five
6     four  four
7     five  five
8     four  four
9     five  five
10     two   Two
11    thre Three
12     two   Two
13   three Three
14     two   Two
15   three Three
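If you would rather not load any package, the same match-based lookup can be sketched in base R with ifelse(). This is only an illustration on a short made-up vector (strings, idx and result are names chosen here), not part of the answer above.

```r
pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")

# Illustrative input; "four" and "five" have no replacement
strings <- c("one", "four", "thre", "three", "five")

# match() returns the position of each string in pattern, or NA if absent
idx <- match(strings, pattern)

# Keep the original string where there was no match
result <- ifelse(is.na(idx), strings, replacements[idx])
print(result)
```

Because match() looks up whole strings rather than substrings, "thre" and "three" are handled independently and no word-boundary patterns are needed.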

Answer 1: (score: 2)

If you want to capitalize, the Hmisc package's capitalize() will work. I apologize if I have misunderstood the question.

library(Hmisc)

x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)

x<-sub("thre[^[:space:]]*", "Three", x$Strings)

xCap<-capitalize(x)

as.data.frame(xCap)
    xCap
1    One
2    Two
3  Three
4   Four
5   Five
6   Four
7   Five
8   Four
9   Five
10   Two
11 Three
12   Two
13 Three
14   Two
15 Three

Thanks to @RuiBarradas in the comments for the sub fix.

Answer 2: (score: 2)

Syntax close to your Python code (using the plyr package):

x$Strings <- plyr::mapvalues(x$Strings, 
                c("one", "two", "thre","three"),
                c("One","Two","Three","Three")
)

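Answer 3: (score: 1)

One approach is to convert them to a factor and then replace the levels. The new levels must be given in the factor's sorted level order ("five", "four", "one", "thre", "three", "two"):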

> x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
> x$Strings <- as.factor(x$Strings)
> levels(x$Strings) <- c("Five", "Four", "One", "Three", "Three", "Two")
> x
   Strings
1      One
2      Two
3    Three
4     Four
5     Five
6     Four
7     Five
8     Four
9     Five
10     Two
11   Three
12     Two
13   Three
14     Two
15   Three

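Answer 4: (score: 1)

Here is an option using recode. Create a list of key/value pairs, then use recode to match the values in 'Strings' against the keys of the list and replace each with the corresponding value: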

library(tidyverse)
lst1 <- list(one = "One", two = "Two", three = "Three", four = "Four", five = "Five")
x %>% 
   mutate(Strings  = recode(Strings, !!! lst1))

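Note: this assumes the "thre" entry is accidental (a typo). lst1 has no key for it, so recode leaves that value unchanged.

Answer 5: (score: 0)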

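library(dplyr)

x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
y <- c("one", "two", "thre", "three")    # values to replace
z <- c("One", "Two", "Three", "Three")   # replacements, in the same order as y

# match() is vectorized, so rowwise() is not needed; assign the result back to x
x <- x %>% mutate(Strings = if_else(!is.na(z[match(Strings, y)]),
                                    z[match(Strings, y)],
                                    false = Strings))

With dplyr, you only need to change y and z.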