I need to replace certain values in a character vector:
x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
> x
Strings
1 one
2 two
3 three
4 four
5 five
6 four
7 five
8 four
9 five
10 two
11 thre
12 two
13 three
14 two
15 three
In Python, I would do this:
x["Strings"].replace(["one", "two", "thre","three"], ["One","Two","Three","Three"], inplace=True)
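But in R, the replace() function does not work this easily. There are many solutions on Stack Overflow for replacing strings, but none of them is this simple. Is it possible in R?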
Answer 0 (score: 3)
If you only want to capitalize the first letter of each word, we can use sub:
x$new <- sub('^([a-z])', '\\U\\1', x$Strings, perl = TRUE)
Output:
Strings new
1 one One
2 two Two
3 three Three
4 four Four
5 five Five
6 four Four
7 five Five
8 four Four
9 five Five
10 two Two
11 thre Thre
12 two Two
13 three Three
14 two Two
15 three Three
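To make the regex above less opaque, here is a minimal sketch comparing it with a plain base R equivalent. The variable names (strings, via_regex, via_substr) are illustrative and not part of the original answer. With perl = TRUE, \\U in the replacement upper-cases the captured group \\1.

```r
# Illustrative input, not the OP's data frame
strings <- c("one", "two", "thre")

# Regex approach: capture the leading lowercase letter and upper-case it
via_regex <- sub("^([a-z])", "\\U\\1", strings, perl = TRUE)

# Equivalent without regex: upper-case the first character, append the rest
via_substr <- paste0(toupper(substr(strings, 1, 1)), substring(strings, 2))

print(via_regex)
print(identical(via_regex, via_substr))
```

Note that neither variant fixes the misspelled "thre"; it only becomes "Thre".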
If you already have a list of old words and their replacements, we can use str_replace_all, which is similar in style to the OP's Python example:
library(stringr)
pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")
named_vec <- setNames(replacements, paste0("\\b", pattern, "\\b"))
x$new <- str_replace_all(x$Strings, named_vec)
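The \\b word boundaries in the names above matter because "thre" is a prefix of "three". The following sketch uses base R's sub() on a made-up two-element vector (an assumption for illustration, not the OP's data) to show what happens with and without them.

```r
# Illustrative input: the misspelling and the correct spelling
strings <- c("thre", "three")

# Without boundaries, "thre" also matches inside "three"
unanchored <- sub("thre", "Three", strings, perl = TRUE)

# With boundaries, only the standalone word "thre" is replaced
anchored <- sub("\\bthre\\b", "Three", strings, perl = TRUE)

print(unanchored)
print(anchored)
```

In the anchored case "three" is left untouched, so the separate "three" pattern can handle it.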
Or use match or hashmap:
library(dplyr)
# Look up each string in pattern; fall back to the original value when there is no match
x$new <- coalesce(replacements[match(x$Strings, pattern)], x$Strings)
library(hashmap)
hash_lookup = hashmap(pattern, replacements)
x$new <- coalesce(hash_lookup[[x$Strings]], x$Strings)
Output:
Strings new
1 one One
2 two Two
3 three Three
4 four four
5 five five
6 four four
7 five five
8 four four
9 five five
10 two Two
11 thre Three
12 two Two
13 three Three
14 two Two
15 three Three
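The match-based lookup can also be written in base R alone. This is a minimal sketch on a shortened, illustrative vector; idx and result are names introduced here, not taken from the answer.

```r
# Illustrative input containing one value ("four") with no replacement
strings <- c("one", "two", "thre", "three", "four")
pattern <- c("one", "two", "thre", "three")
replacements <- c("One", "Two", "Three", "Three")

# match() returns the position of each string in pattern, or NA if absent
idx <- match(strings, pattern)

# Keep the original value wherever no replacement was found
result <- ifelse(is.na(idx), strings, replacements[idx])
print(result)
```

This mirrors the coalesce() call above: unmatched values such as "four" pass through unchanged.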
Answer 1 (score: 2)
If you want capitalization, capitalize() from the Hmisc package will work. I apologize if I have misunderstood the question.
library(Hmisc)
x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
x<-sub("thre[^[:space:]]*", "Three", x$Strings)
xCap<-capitalize(x)
as.data.frame(xCap)
xCap
1 One
2 Two
3 Three
4 Four
5 Five
6 Four
7 Five
8 Four
9 Five
10 Two
11 Three
12 Two
13 Three
14 Two
15 Three
Thanks to @RuiBarradas for the sub fix in the comments.
Answer 2 (score: 2)
A syntax close to your Python code (using the plyr package):
x$Strings <- plyr::mapvalues(x$Strings,
c("one", "two", "thre","three"),
c("One","Two","Three","Three")
)
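Answer 3 (score: 1)
One approach is to convert the column to a factor and then replace the levels. The new levels must be given in the order of the existing (alphabetically sorted) levels: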
> x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
> x$Strings <- as.factor(x$Strings)
> levels(x$Strings) <- c("Five", "Four", "One", "Three", "Three", "Two")
> x
Strings
1 One
2 Two
3 Three
4 Four
5 Five
6 Four
7 Five
8 Four
9 Five
10 Two
11 Three
12 Two
13 Three
14 Two
15 Three
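Answer 4 (score: 1)
Here is an option using recode. Create a list of key/value pairs, then use recode to match the values in 'Strings' against the keys of the list and replace them with the corresponding values.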
library(tidyverse)
lst1 <- list(one = "One", two = "Two", three = "Three", four = "Four", five = "Five")
x %>%
mutate(Strings = recode(Strings, !!! lst1))
Note: this assumes the camel casing is incidental.
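The list above has no entry for the misspelled "thre". A minimal sketch, assuming you want it mapped to "Three" as in the question, is to add that key to the lookup. The vector strings and the name lookup are illustrative; recode() leaves values without a matching key unchanged.

```r
library(dplyr)

# Illustrative input, including the misspelling and an unmapped value
strings <- c("one", "thre", "three", "four")

# Named vector: names are old values, elements are replacements
lookup <- c(one = "One", thre = "Three", three = "Three")

# !!! splices the named vector into recode() as old = new arguments
recoded <- recode(strings, !!!lookup)
print(recoded)
```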
Answer 5 (score: 0)
x <- data.frame(Strings = c("one", "two","three","four","five","four","five","four","five","two","thre","two","three","two","three"), stringsAsFactors = FALSE)
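library(dplyr)
y <- c("one", "two", "thre", "three")  # values to find
z <- c("One", "Two", "Three", "Three") # replacements, aligned with y
# match() is vectorised, so rowwise() is unnecessary; unmatched values are kept
x <- x %>% mutate(Strings = if_else(!is.na(match(Strings, y)),
                                    z[match(Strings, y)], Strings))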
With dplyr, you only need to change y and z.