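I'm trying to filter my table down to only the rows whose title column contains the word "dog", but I can't get it to work.
Here is a sample of the data: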
ID NozamaItemID NozamaTitle
1 4557 12000017544 Starbucks Double Shot Espresso Light (4 Count, 6.5 Fl Oz Each)
2 4558 12000021992 Pepsi, 8Ct, 12Oz Bottle
3 4559 12000024542 Zuke'S Natural Hip Action dog Treats, 3 Oz
4 4560 12000030680 Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans
5 4561 12000030680 Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans
6 4562 12000030680 Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans
The following code should work, but it doesn't:
amzp <- select(amz, ID, NozamaItemID, NozamaTitle, NozamaCustomerID)
searchTerm="cat|dog"
amzp.a <- mutate(amzp, animalFood = ifelse(grepl(searchTerm, amzp$NozamaTitle, ignore.case = TRUE) == TRUE, TRUE, FALSE))
I expect to see TRUE for row 3. Any help is appreciated. Thanks.
Answer 0 (score: 3)
You're close; you just need to get rid of the ifelse:
amzp.a <- mutate(amzp, animalFood = grepl(searchTerm,
NozamaTitle, ignore.case = TRUE))
This gives:
> amzp.a
ID NozamaItemID NozamaTitle animalFood
1 4557 12000017544 Starbucks Double Shot Espresso Light (4 Count, 6.5 Fl Oz Each) FALSE
2 4558 12000021992 Pepsi, 8Ct, 12Oz Bottle FALSE
3 4559 12000024542 Zuke'S Natural Hip Action dog Treats, 3 Oz TRUE
4 4560 12000030680 Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans FALSE
5 4561 12000030680 Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans FALSE
6 4562 12000030680 Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans FALSE
Data used:
amzp <- structure(list(ID = 4557:4562,
NozamaItemID = c(12000017544, 12000021992, 12000024542, 12000030680, 12000030680, 12000030680),
NozamaTitle = structure(c(4L, 1L, 2L, 3L, 3L, 3L), .Label = c("Pepsi, 8Ct, 12Oz Bottle","Zuke'S Natural Hip Action dog Treats, 3 Oz","Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans","Starbucks Double Shot Espresso Light (4 Count, 6.5 Fl Oz Each)"), class = "factor")),
.Names = c("ID", "NozamaItemID", "NozamaTitle"), class = "data.frame", row.names = c(NA, -6L))
Edit: your original code:
amzp.a <- mutate(amzp, animalFood = ifelse(grepl(searchTerm, amzp$NozamaTitle, ignore.case = TRUE) == TRUE, TRUE, FALSE))
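does work. Although it contains a few unnecessary components (the ifelse statement and the use of data$column inside a standard dplyr function), it gives the desired result: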
> amzp.a
ID NozamaItemID NozamaTitle animalFood
1 4557 12000017544 Starbucks Double Shot Espresso Light (4 Count, 6.5 Fl Oz Each) FALSE
2 4558 12000021992 Pepsi, 8Ct, 12Oz Bottle FALSE
3 4559 12000024542 Zuke'S Natural Hip Action dog Treats, 3 Oz TRUE
4 4560 12000030680 Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans FALSE
5 4561 12000030680 Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans FALSE
6 4562 12000030680 Pepsi Made With Real Sugar, 12 Ct, 12 Oz Cans FALSE
So you may want to describe in more detail what you mean by "doesn't work".
Answer 1 (score: 2)
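To illustrate why the ifelse wrapper is redundant: grepl already returns a logical vector, so comparing it with TRUE and passing it through ifelse just hands the same values back. A minimal sketch in base R, assuming a hypothetical two-element titles vector that stands in for amzp$NozamaTitle (not the asker's full data):

```r
# Hypothetical vector standing in for amzp$NozamaTitle (assumption for illustration)
titles <- c("Pepsi, 8Ct, 12Oz Bottle",
            "Zuke'S Natural Hip Action dog Treats, 3 Oz")
searchTerm <- "cat|dog"

# grepl() already yields one TRUE/FALSE per element
direct <- grepl(searchTerm, titles, ignore.case = TRUE)

# The wrapper maps TRUE to TRUE and FALSE to FALSE, so the values are unchanged
wrapped <- ifelse(grepl(searchTerm, titles, ignore.case = TRUE) == TRUE, TRUE, FALSE)

# Check that both approaches agree
identical(direct, wrapped)
```

Since the two vectors agree, either form can be passed to mutate; the shorter one is simply easier to read.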
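I'm not entirely sure what you're trying to achieve, but if your goal is just to keep the rows where the word "dog" appears in the NozamaTitle column, you can simply use dplyr::filter. Using chickwts as an example in lieu of a minimal reproducible example: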
levels(chickwts$feed)
# [1] "casein" "horsebean" "linseed" "meatmeal" "soybean"
# [6] "sunflower"
df <- filter(chickwts, grepl("bean", feed))
df
# weight feed
# 1 179 horsebean
# 2 160 horsebean
# 3 136 horsebean
# ...
# 11 243 soybean
# 12 230 soybean
# 13 248 soybean
# ...
Is this what you're after?
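Applied to data shaped like the question's, the same idea might look like the sketch below. It assumes dplyr is installed, and amz_sample is a hypothetical two-row data frame copied from the question's sample rather than the asker's real table:

```r
library(dplyr)

# Hypothetical two-row stand-in for the asker's table (assumption for illustration)
amz_sample <- data.frame(
  ID = c(4558L, 4559L),
  NozamaTitle = c("Pepsi, 8Ct, 12Oz Bottle",
                  "Zuke'S Natural Hip Action dog Treats, 3 Oz")
)

# Keep only the rows whose title matches the pattern, ignoring case
dogs <- filter(amz_sample, grepl("dog", NozamaTitle, ignore.case = TRUE))
dogs
```

Unlike the mutate approach, which adds a TRUE/FALSE column and keeps every row, filter drops the non-matching rows outright.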