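I have a data frame in R in which one column holds a long string. I want to use that string to create a new column with specific values.
Here is the sample data frame:
dom <- data.frame(
  Site = c("alpha", "beta", "charlie", "delta"),
  Banner = c("testing_Watermelon -DPI_300x250 v2", "notest_Vanilla Latte-DPI_300x250 v2", "bottle :15s", "aaaa vvvv cccc Build_Mobile_320x480")
)
Now, if the string in the Banner column contains Watermelon or Vanilla, the new column label should hold just Watermelon or Vanilla; otherwise it should be Default. This is what the expected data frame should look like:
     Site                              Banner      label
1   alpha  testing_Watermelon -DPI_300x250 v2 Watermelon
2    beta notest_Vanilla Latte-DPI_300x250 v2    Vanilla
3 charlie                         bottle :15s    Default
4   delta aaaa vvvv cccc Build_Mobile_320x480    Default
How can I apply multiple conditions like this, with a conditional function or in any other way?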
Answer 0 (score: 4)
library(stringr)
dom$label = str_extract(dom$Banner, "Watermelon|Vanilla")
dom$label[is.na(dom$label)] <- "Default"
dom
# Site Banner label
# 1 alpha testing_Watermelon -DPI_300x250 v2 Watermelon
# 2 beta notest_Vanilla Latte-DPI_300x250 v2 Vanilla
# 3 charlie bottle :15s Default
# 4 delta aaaa vvvv cccc Build_Mobile_320x480 Default
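One behaviour worth knowing about this approach: with an alternation such as "Watermelon|Vanilla", str_extract() returns the leftmost match in the string, not the first alternative listed in the pattern. A minimal sketch, assuming a made-up banner string (not from the question) that contains both keywords:

```r
library(stringr)

# Illustrative string (not from the question) containing both keywords.
# "Vanilla" appears earlier in the string, so it is the match returned,
# even though "Watermelon" is listed first in the pattern.
both <- str_extract("Vanilla_and_Watermelon_300x250", "Watermelon|Vanilla")
both
```

Here both is "Vanilla". An approach that tests the keywords one at a time in a fixed order would instead return whichever keyword is tested first, so the two styles can disagree on banners that mention more than one keyword.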
Answer 1 (score: 0)
Here is a simple solution using base R:
#Sample data:
dom <- data.frame(
Site = c("alpha", "beta", "charlie", "delta"),
Banner = c("testing_Watermelon -DPI_300x250 v2" , "notest_Vanilla Latte-DPI_300x250 v2" , "bottle :15s","aaaa vvvv cccc Build_Mobile_320x480")
)
dom$label <- ifelse(grepl("watermelon", dom$Banner, ignore.case = TRUE), "Watermelon",
                    ifelse(grepl("vanilla", dom$Banner, ignore.case = TRUE), "Vanilla", "Default"))
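The nested ifelse() calls get harder to read with every extra keyword. A minimal sketch of one way to generalize the same idea, assuming a hypothetical helper label_by_keyword() that is not part of the original answer. It treats each keyword as a case-insensitive regular expression and gives earlier keywords priority, as the nested calls do:

```r
# Hypothetical helper (not from the original answer): assign the first
# keyword found in each string, or a fallback when none matches.
label_by_keyword <- function(x, keywords, default = "Default") {
  out <- rep(default, length(x))
  # Walk the keywords in reverse so that earlier keywords overwrite later
  # ones, mirroring the priority order of nested ifelse() calls.
  for (kw in rev(keywords)) {
    out[grepl(kw, x, ignore.case = TRUE)] <- kw
  }
  out
}

dom <- data.frame(
  Site = c("alpha", "beta", "charlie", "delta"),
  Banner = c("testing_Watermelon -DPI_300x250 v2",
             "notest_Vanilla Latte-DPI_300x250 v2",
             "bottle :15s",
             "aaaa vvvv cccc Build_Mobile_320x480")
)
dom$label <- label_by_keyword(dom$Banner, c("Watermelon", "Vanilla"))
dom$label
```

Adding another keyword then only means extending the vector passed as keywords. Keywords containing regex metacharacters would need escaping, or fixed = TRUE in place of ignore.case.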
Answer 2 (score: 0)
One base R possibility:
labels <- paste(c("Watermelon", "Orange"), collapse = "|")
dom$label <- sapply(regmatches(dom$Banner, regexec(labels, dom$Banner)), "[", 1)
dom$label[is.na(dom$label)] <- "Default"
Site Banner label
1 alpha testing_Watermelon -DPI_300x250 v2 Watermelon
2 beta notest_Orange Latte-DPI_300x250 v2 Orange
3 charlie bottle :15s Default
4 delta aaaa vvvv cccc Build_Mobile_320x480 Default
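For context on why the is.na() step is needed: when regexec() finds no match, regmatches() returns a zero-length character vector for that element, and taking element 1 of it with "[" yields NA. A small sketch of the intermediate values, using two strings chosen for illustration:

```r
x <- c("notest_Orange Latte", "bottle :15s")  # illustrative strings

# regmatches() returns a list with one element per input string:
# the matched text where there is a match, character(0) where there is none.
m <- regmatches(x, regexec("Watermelon|Orange", x))

# Taking element 1 of character(0) gives NA, so unmatched rows become NA.
first <- sapply(m, "[", 1)

# Replace the NA values with the fallback label.
first[is.na(first)] <- "Default"
first
```

The final vector holds "Orange" for the first string and "Default" for the second.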
The same approach also works with dplyr and tidyr:
library(dplyr)
library(tidyr)
dom %>%
  mutate(label = sapply(regmatches(Banner, regexec(labels, Banner)), "[", 1),
         label = replace_na(label, "Default"))
Sample data:
dom <- data.frame(
Site = c("alpha", "beta", "charlie", "delta"),
Banner = c("testing_Watermelon -DPI_300x250 v2" , "notest_Orange Latte-DPI_300x250 v2" , "bottle :15s","aaaa vvvv cccc Build_Mobile_320x480")
)
Answer 3 (score: 0)
library(dplyr)
library(stringi)
dom %>% mutate(label = case_when(stri_detect_fixed(Banner, "Watermelon") ~ "Watermelon",
stri_detect_fixed(Banner, "Vanilla") ~ "Vanilla",
TRUE ~ "Default"))
#> Site Banner label
#> 1 alpha testing_Watermelon -DPI_300x250 v2 Watermelon
#> 2 beta notest_Vanilla Latte-DPI_300x250 v2 Vanilla
#> 3 charlie bottle :15s Default
#> 4 delta aaaa vvvv cccc Build_Mobile_320x480 Default
Data:
dom <- data.frame(Site = c("alpha", "beta", "charlie", "delta"),
Banner = c("testing_Watermelon -DPI_300x250 v2",
"notest_Vanilla Latte-DPI_300x250 v2",
"bottle :15s",
"aaaa vvvv cccc Build_Mobile_320x480"))