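I'm working with some restaurant data for a personal project. The data is labelled with either a table number or a to-go order name. I want to change all of the to-go order names to "Togo" while keeping all of the table numbers.
Example: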
> chknum <- seq(1:10)
> Tble <- c("1","5","12","Togo", "Bob togo","Cheesecake togo","Togo in 15 mins", "To go", "To-go","4")
> data.frame(chknum,Tble)
   chknum            Tble
1       1               1
2       2               5
3       3              12
4       4            Togo
5       5        Bob togo
6       6 Cheesecake togo
7       7 Togo in 15 mins
8       8           To go
9       9           To-go
10     10               4
Ideally, I'd like all of the to-go orders to share the same label:
> togo <- c("1","5","12",rep("Togo",6),"4")
> data.frame(chknum,togo)
   chknum togo
1       1    1
2       2    5
3       3   12
4       4 Togo
5       5 Togo
6       6 Togo
7       7 Togo
8       8 Togo
9       9 Togo
10     10    4
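I have tried factor(x) and renaming the levels the way I know how, but there are hundreds of distinct order names among the factor levels, and I'm not sure what the most efficient approach is.
Answer 0 (score: 2)
We can convert the column to numeric, which yields NA for every non-numeric element, and then replace those elements with "Togo":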
df1$Tble[is.na(as.numeric(df1$Tble))] <- "Togo"
df1
# chknum Tble
#1 1 1
#2 2 5
#3 3 12
#4 4 Togo
#5 5 Togo
#6 6 Togo
#7 7 Togo
#8 8 Togo
#9 9 Togo
#10 10 4
Data used above:
df1 <- data.frame(chknum, Tble, stringsAsFactors=FALSE)
Answer 1 (score: 1)
You can try a regular expression:
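The stringsAsFactors=FALSE argument matters for this approach. As a minimal sketch (the name df2 and the factor-valued column are assumptions for illustration, not part of the answer), the following shows what happens if Tble is a factor, and one way to guard against it while also silencing the expected coercion warning:

```r
# Sample data from the question; Tble is deliberately stored as a factor here
chknum <- 1:10
Tble <- c("1", "5", "12", "Togo", "Bob togo", "Cheesecake togo",
          "Togo in 15 mins", "To go", "To-go", "4")
df2 <- data.frame(chknum, Tble, stringsAsFactors = TRUE)  # df2 is a hypothetical name

# as.numeric() on a factor returns the internal level codes,
# so nothing is NA and no row would be relabelled
codes <- as.numeric(df2$Tble)
any(is.na(codes))  # FALSE

# Convert to character first; suppressWarnings() hides the expected
# "NAs introduced by coercion" warning
tbl_chr <- as.character(df2$Tble)
not_a_number <- is.na(suppressWarnings(as.numeric(tbl_chr)))
tbl_chr[not_a_number] <- "Togo"
df2$Tble <- tbl_chr
df2$Tble
```

Note that this treats every non-numeric label as a to-go order, which fits the sample data but would also relabel any other text entry.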
chknum <- seq(1:10)
Tble <- c("1","5","12","Togo", "Bob togo","Cheesecake togo","Togo in 15 mins", "To go", "To-go","4")
Tble[grepl("[Tt][Oo].*[Gg][Oo]", Tble)] <- "Togo"
cbind(chknum, Tble)
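Here the expression "[Tt][Oo].*[Gg][Oo]" means "'to' in any capitalization, followed by anything, followed by 'go' in any capitalization". That should catch basically any variation you are likely to see. It is liberal, though, so it will also catch things like "tomato goose".
Answer 2 (score: 1)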
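If the over-matching is a concern, the pattern can be tightened. The sketch below compares the loose pattern with a stricter one; the stricter pattern and the two extra labels ("tomato goose", "Patio 3") are assumptions made up for illustration, and whether they suit the real data would need checking:

```r
# Labels from the question plus two hypothetical non-to-go entries
labels <- c("1", "5", "12", "Togo", "Bob togo", "Cheesecake togo",
            "Togo in 15 mins", "To go", "To-go", "4",
            "tomato goose", "Patio 3")

# Pattern from the answer above
loose <- "[Tt][Oo].*[Gg][Oo]"
# Tighter variant (an assumption): "to" at a word boundary, optional
# spaces or hyphens, then "go" ending at a word boundary
strict <- "\\bto[ -]*go\\b"

loose_hit  <- grepl(loose, labels)
strict_hit <- grepl(strict, labels, ignore.case = TRUE, perl = TRUE)

# Labels the loose pattern catches but the strict one leaves alone
labels[loose_hit & !strict_hit]  # "tomato goose"

cleaned <- labels
cleaned[strict_hit] <- "Togo"
cleaned
```

The trade-off is that the stricter pattern would miss a label where "togo" is glued to another word, so it is worth inspecting the unmatched non-numeric labels before relying on it.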
library(stringr)
# Build the data frame from the question's vectors (it was not defined in this answer).
df <- data.frame(chknum, Tble, stringsAsFactors = FALSE)
# Grab the numerics first: must be digits (\\d) from beginning (^) to end ($).
# Replace with what was captured between the parentheses, i.e. don't modify.
# This isn't strictly necessary, but is left in to show how to match numerics.
df$togo <- str_replace(trimws(df$Tble), "^(\\d+)$", "\\1")
# Match any string containing "to" and "go" separated by zero or more spaces (\\s) or dashes (\\-). Ignore case with (?i).
# The pattern covers the whole string, which is replaced with "Togo".
df$togo <- str_replace(df$togo, "(?i)(.*to(\\s|\\-)*go.*)", "Togo")
df
# chknum Tble togo
# 1 1 1 1
# 2 2 5 5
# 3 3 12 12
# 4 4 Togo Togo
# 5 5 Bob togo Togo
# 6 6 Cheesecake togo Togo
# 7 7 Togo in 15 mins Togo
# 8 8 To go Togo
# 9 9 To-go Togo
# 10 10 4 4