My sample data looks like this:
df <-data.frame(product_path = c("https://mycommerece.com/product/book/miracle", "https://mycommerece.com/product/book/miracle2", "https://mycommerece.com/product/gadget/airplane", "https://mycommerece.com/product/book/miracle3"), var1 = c(1,1,1,0), commereceurl = c("https://mycommerece.com/product/","https://mycommerece.com/product/","https://mycommerece.com/product2/","https://www.test.com"), var2 = c(1,0,0,1))
> df
product_path var1 commereceurl var2
1 https://mycommerece.com/product/book/miracle 1 https://mycommerece.com/product/ 1
2 https://mycommerece.com/product/book/miracle2 1 https://mycommerece.com/product/ 0
3 https://mycommerece.com/product/gadget/airplane 1 https://mycommerece.com/product2/ 0
4 https://mycommerece.com/product/book/miracle3 0 https://www.test.com 1
Using the commereceurl column, I want to delete every row whose value does not start with "https://mycommerece.com".
Example output:
df <-data.frame(product_path = c("https://mycommerece.com/product/book/miracle", "https://mycommerece.com/product/book/miracle2", "https://mycommerece.com/product/gadget/airplane"), var1 = c(1,1,1), commereceurl = c("https://mycommerece.com/product/","https://mycommerece.com/product/","https://mycommerece.com/product2/"), var2 = c(1,0,0))
How can I implement this rule?
Answer 0 (score: 3)
You can use grep to find the rows whose commereceurl starts with that prefix, then keep only those rows:
# Indices of rows whose URL starts with the prefix; dots are escaped so they match literally
KEEP = grep("^https://mycommerece\\.com", df$commereceurl)
df = df[KEEP, ]
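As an alternative sketch, the same filter can be written with a logical mask instead of row indices. This assumes R 3.3.0 or later, where base R provides startsWith(); the names prefix, keep, and filtered are illustrative and not part of the original answer. Because startsWith() compares strings literally, no regex escaping is needed.

```r
# Sample data reproduced from the question
df <- data.frame(
  product_path = c("https://mycommerece.com/product/book/miracle",
                   "https://mycommerece.com/product/book/miracle2",
                   "https://mycommerece.com/product/gadget/airplane",
                   "https://mycommerece.com/product/book/miracle3"),
  var1 = c(1, 1, 1, 0),
  commereceurl = c("https://mycommerece.com/product/",
                   "https://mycommerece.com/product/",
                   "https://mycommerece.com/product2/",
                   "https://www.test.com"),
  var2 = c(1, 0, 0, 1)
)

# Illustrative name: the prefix every kept row must start with
prefix <- "https://mycommerece.com"

# as.character() guards against the column being a factor (the default before R 4.0)
keep <- startsWith(as.character(df$commereceurl), prefix)

# Treat missing URLs as non-matching so they are dropped rather than kept as NA rows
filtered <- df[keep & !is.na(keep), ]
```

A logical mask like keep can also be combined with other conditions using & or |, which is harder to do with the integer indices that grep returns.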