Renaming many factor levels to the same new level name in R

Time: 2017-04-19 01:58:49

Tags: r

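I'm working with some restaurant data for a personal project. The data is organized so that one column holds either a table number or the name on a to-go order. I want to change every to-go order name to "Togo" while keeping all the table numbers.

Example: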

> chknum <- seq(1:10)
> Tble <- c("1","5","12","Togo", "Bob togo","Cheesecake togo","Togo in 15 mins", "To go", "To-go","4")
> data.frame(chknum,Tble)
   chknum            Tble
1       1               1
2       2               5
3       3              12
4       4            Togo
5       5        Bob togo
6       6 Cheesecake togo
7       7 Togo in 15 mins
8       8           To go
9       9           To-go
10     10               4

Ideally, all the to-go orders would end up with the same label:

> togo <- c("1","5","12",rep("Togo",6),"4")
> data.frame(chknum,togo)
   chknum togo
1       1    1
2       2    5
3       3   12
4       4 Togo
5       5 Togo
6       6 Togo
7       7 Togo
8       8 Togo
9       9 Togo
10     10    4

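I've tried factor(x) and renaming the levels the way I know how, but there are hundreds of distinct order-name levels, and I'm not sure what the most efficient approach is.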

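3 Answers:

Answer 0: (score: 2)

We can convert the column to numeric, which yields NA for every non-numeric element (with a coercion warning), and then replace those elements with "Togo". This relies on Tble being a character column, as in the data below.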

df1$Tble[is.na(as.numeric(df1$Tble))] <- "Togo"
df1
#   chknum Tble
#1       1    1
#2       2    5
#3       3   12
#4       4 Togo
#5       5 Togo
#6       6 Togo
#7       7 Togo
#8       8 Togo
#9       9 Togo
#10     10    4

Data

df1 <- data.frame(chknum,Tble, stringsAsFactors=FALSE)
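If the column is a factor, as the question describes, calling as.numeric on it directly returns the internal integer codes, so nothing comes back as NA. A minimal sketch of the same idea applied to the level labels instead follows; it assumes the column was stored as a factor, and the names df_fac, lv and non_numeric are hypothetical.

```r
# Assumption: Tble is stored as a factor (e.g. the data frame was built with
# stringsAsFactors = TRUE). df_fac, lv and non_numeric are hypothetical names.
chknum <- 1:10
Tble <- c("1", "5", "12", "Togo", "Bob togo", "Cheesecake togo",
          "Togo in 15 mins", "To go", "To-go", "4")
df_fac <- data.frame(chknum, Tble = factor(Tble))

# Work on the distinct level labels rather than on every row.
lv <- levels(df_fac$Tble)

# Labels that cannot be converted to a number become NA; the coercion
# warning is expected here, so it is suppressed.
non_numeric <- is.na(suppressWarnings(as.numeric(lv)))

# Giving several levels the same label merges them into a single level.
levels(df_fac$Tble)[non_numeric] <- "Togo"

df_fac
```

Because only the level labels are touched, the cost scales with the number of distinct order names rather than the number of rows.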

Answer 1: (score: 1)

You could try a regular expression:

chknum <- seq(1:10)
Tble <- c("1","5","12","Togo", "Bob togo","Cheesecake togo","Togo in 15 mins", "To go", "To-go","4")
Tble[grepl("[Tt][Oo].*[Gg][Oo]", Tble)] <- "Togo"
cbind(chknum, Tble)

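Here the expression "[Tt][Oo].*[Gg][Oo]" means "'to' in any capitalization, followed by anything, followed by 'go' in any capitalization". It should catch essentially any variation you are likely to see. It is quite liberal, though, so it will also match something like "tomato goose".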
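If that over-matching is a concern, one possible tightening is to require word boundaries and allow only spaces or dashes between "to" and "go". This is a sketch rather than part of the original answer; the extra "tomato goose" entry and the names x and is_togo are added purely for illustration.

```r
# Hypothetical test vector: the question's values plus one false-positive candidate.
x <- c("1", "5", "12", "Togo", "Bob togo", "Cheesecake togo",
       "Togo in 15 mins", "To go", "To-go", "4", "tomato goose")

# \\b marks a word boundary, so "to" must start a word and "go" must end one;
# [ -]* allows any number of spaces or dashes in between.
is_togo <- grepl("\\bto[ -]*go\\b", x, ignore.case = TRUE, perl = TRUE)

x[is_togo] <- "Togo"
x
```

With this pattern the "tomato goose" entry is left alone, because neither "to" inside "tomato" nor "go" inside "goose" sits on a word boundary at both ends.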

Answer 2: (score: 1)

library(stringr)

# assumes chknum and Tble as defined in the question
df <- data.frame(chknum, Tble, stringsAsFactors = FALSE)

# Match the numeric entries first: digits only (\\d) from beginning (^) to end ($).
# Replace each with the first capture group (\\1), i.e. leave it unmodified.
# This isn't strictly necessary; it is left in to show how to match the numerics.
df$togo <- str_replace(trimws(df$Tble), "^(\\d+)$", "\\1")

# Match any string containing "to" and "go" separated by zero or more
# spaces (\\s) or dashes (\\-). (?i) makes the match case-insensitive.
# The pattern captures the whole string, which is replaced with "Togo".
df$togo <- str_replace(trimws(df$Tble), "(?i)(.*to(\\s|\\-)*go.*)", "Togo")
df

# chknum            Tble togo
# 1       1               1    1
# 2       2               5    5
# 3       3              12   12
# 4       4            Togo Togo
# 5       5        Bob togo Togo
# 6       6 Cheesecake togo Togo
# 7       7 Togo in 15 mins Togo
# 8       8           To go Togo
# 9       9           To-go Togo
# 10     10               4    4