Question

I am trying to check whether each of my users ordered the same product across all of the orders they placed.

My dataset looks like this:

Users  Product Ordered  
A        Onion                
A        Onion                
A        Onion                
B        Carrots              
B        Carrots              
B        Spinach

I want to create a new column called "Ordered the same thing?":

Users  Product Ordered   Ordered the same thing?
A        Onion                Y
A        Onion                Y
A        Onion                Y
B        Carrots              N
B        Carrots              N
B        Spinach              N

Answer 1

We can check this with n_distinct, which counts the distinct products within each user's group:

library(dplyr)
df1 %>%
  group_by(Users) %>%
  mutate(OrderedtheSamething = n_distinct(ProductOrdered)==1)

This returns a logical column, which is preferable to "Y"/"N". If the "Y"/"N" labels are needed, though, the mutate step can be changed to

df1 %>%
  group_by(Users) %>%
  mutate(OrderedtheSamething = c("N", "Y")[(n_distinct(ProductOrdered)==1) +1])
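
For a reproducible check, here is a minimal sketch that rebuilds the question's data and produces both the logical and the "Y"/"N" columns. The data frame construction, the column name ProductOrdered (without a space), and the output column names SameLogical and SameYN are assumptions made for illustration; if_else() is used here as a swapped-in alternative to the c("N", "Y")[... + 1] indexing trick, not as the answer's original method.

```r
library(dplyr)

# Toy data copied from the question. The column name ProductOrdered
# (no space) is an assumption that matches the code in this answer.
df1 <- data.frame(
  Users = c("A", "A", "A", "B", "B", "B"),
  ProductOrdered = c("Onion", "Onion", "Onion", "Carrots", "Carrots", "Spinach"),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  group_by(Users) %>%
  mutate(
    # TRUE when the user has exactly one distinct product
    SameLogical = n_distinct(ProductOrdered) == 1,
    # Map the logical flag to "Y"/"N" labels
    SameYN = if_else(SameLogical, "Y", "N")
  ) %>%
  ungroup()  # drop the grouping so later steps are not applied per user

print(res)
```

The c("N", "Y")[x + 1] form works because FALSE and TRUE coerce to 0 and 1, so adding 1 gives index 1 ("N") or 2 ("Y").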

A similar option with data.table would be

library(data.table)
setDT(df1)[, OrderedtheSamething := uniqueN(ProductOrdered)==1, by = Users]

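Or using base R with table

# Cross-tabulate users against products, then keep users with exactly one non-zero product count.
# The two columns are passed explicitly so that extra columns in df1 do not enter the table.
tbl <- table(df1$Users, df1$ProductOrdered)
df1$OrderedtheSamething <- df1$Users %in% names(which(rowSums(tbl > 0) == 1))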
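
Another base R route, shown here only as a hedged sketch and not part of the original answer, is ave(), which computes a per-group value and recycles it to every row of that group. The data frame below is rebuilt from the question, and the helper names product_code and n_products are hypothetical.

```r
# Toy data copied from the question (column names are assumptions).
df1 <- data.frame(
  Users = c("A", "A", "A", "B", "B", "B"),
  ProductOrdered = c("Onion", "Onion", "Onion", "Carrots", "Carrots", "Spinach"),
  stringsAsFactors = FALSE
)

# ave() keeps the type of its first argument, so encode products as integers
# to get a numeric count back instead of a character vector.
product_code <- as.integer(factor(df1$ProductOrdered))

# Number of distinct products per user, repeated on each of that user's rows
n_products <- ave(product_code, df1$Users,
                  FUN = function(x) length(unique(x)))

df1$OrderedtheSamething <- n_products == 1
print(df1)
```

This keeps the original row order and needs no extra packages, at the cost of the intermediate integer encoding.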
