I have data like this:
library(dplyr)
glimpse(full_dat)
Observations: 9,720
Variables: 6
$ Product <chr> "Apple iPhone 4s 8GB Unlocked GSM Smartphone w/ S...
$ Brand <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
$ Price <dbl> 115, 115, 115, 115, 115, 115, 115, 115, 115, 115,...
$ Rating <dbl> 5, 1, 4, 5, 5, 3, 5, 5, 5, 1, 5, 5, 1, 5, 2, 5, 5...
$ Reviews <chr> "It was new and at a great price! Phone came real...
$ Votes <dbl> 2, 1, 0, 1, 2, 2, 2, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0...
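I want to change the values of the string variable Product. For example, if a value contains the pattern "iphone 4s", I want to change that value to just "iphone 4s".
Pseudocode (desired result):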
glimpse(full_dat)
Observations: 9,720
Variables: 6
$ Product <chr> "iPhone 4s", "iPhone 4s", "iPhone 4s", "iphone 4s...
$ Brand <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
$ Price <dbl> 115, 115, 115, 115, 115, 115, 115, 115, 115, 115,...
$ Rating <dbl> 5, 1, 4, 5, 5, 3, 5, 5, 5, 1, 5, 5, 1, 5, 2, 5, 5...
$ Reviews <chr> "It was new and at a great price! Phone came real...
$ Votes <dbl> 2, 1, 0, 1, 2, 2, 2, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0...
I read a similar post that suggested the following solution.
full_dat %>%
mutate_at(vars(contains('iphone 4s')), funs(.=='ipohne 4s'))
However, this does not work in my case: the values stay the same.
Here is a small sample:
product = c(full_dat$Product[1:5])
dput(product)
c("Apple iPhone 4s 8GB Unlocked GSM Smartphone w/ Siri, iCloud and 8MP Camera - Black",
"Apple iPhone 4s 8GB Unlocked GSM Smartphone w/ Siri, iCloud and 8MP Camera - Black",
"Apple iPhone 4s 8GB Unlocked GSM Smartphone w/ Siri, iCloud and 8MP Camera - Black",
"Apple iPhone 4s 8GB Unlocked GSM Smartphone w/ Siri, iCloud and 8MP Camera - Black",
"Apple iPhone 4s 8GB Unlocked GSM Smartphone w/ Siri, iCloud and 8MP Camera - Black"
)
Answer 0 (score: 1)
I think you are looking for
library(dplyr)
samp %>%
mutate_at(vars(Product), funs(replace(., grepl('iPhone 4s', .), 'iphone 4s')))
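This uses replace to change any Product value that contains "iPhone 4s" to just "iphone 4s". Note that grepl is case-sensitive by default, so the pattern has to be spelled "iPhone 4s" to match the sample data.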
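The attempt in the question leaves the data unchanged because contains() selects columns by name rather than matching values inside a column. Also, funs() is deprecated in newer dplyr releases. As a minimal sketch of the same replacement with plain mutate(), assuming a small made-up data frame samp with a Product column (the second row is invented for illustration):

```r
library(dplyr)

# Hypothetical sample data; only the first string comes from the question
samp <- tibble(
  Product = c(
    "Apple iPhone 4s 8GB Unlocked GSM Smartphone w/ Siri, iCloud and 8MP Camera - Black",
    "Some Other Phone 16GB"
  )
)

samp <- samp %>%
  mutate(Product = if_else(
    grepl("iphone 4s", Product, ignore.case = TRUE),  # case-insensitive match
    "iphone 4s",                                      # replacement label
    Product                                           # leave other rows as-is
  ))
```

With ignore.case = TRUE the pattern no longer has to match the capitalization used in the data.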
Of course, you can also do this without dplyr:
samp$Product <- with(samp, replace(Product, grepl('iPhone 4s', Product), 'iphone 4s'))
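If more than one model needs to be collapsed this way, the same base R idea can be looped over a vector of labels. This is only a sketch: the data frame samp, the strings in it, and the labels vector are all made up for illustration, and each label doubles as the (case-insensitive) search pattern.

```r
# Hypothetical sample data
samp <- data.frame(
  Product = c(
    "Apple iPhone 4s 8GB Unlocked GSM Smartphone",
    "Samsung Galaxy S5 16GB White",
    "Some Other Phone 16GB"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical short labels; each one is also used as the pattern to search for
labels <- c("iphone 4s", "galaxy s5")

for (lab in labels) {
  hit <- grepl(lab, samp$Product, ignore.case = TRUE)  # rows mentioning this model
  samp$Product[hit] <- lab                             # overwrite with the short label
}
```

Rows that match none of the labels keep their original value.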