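I have the following data frame with several thousand rows; a sample of its output is shown below. It records orders placed on an e-commerce website and lists the products purchased under each order ID.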
| order_id| product_id|product_name |
|--------:|----------:|:--------------------------------|
| 1187899| 196|Soda |
| 1187899| 25133|Organic String Cheese |
| 1187899| 38928|0% Greek Strained Yogurt |
| 1187899| 26405|XL Pick-A-Size Paper Towel Rolls |
| 1187899| 39657|Milk Chocolate Almonds |
| 1187899| 10258|Pistachios |
| 1187899| 13032|Cinnamon Toast Crunch |
| 1187899| 26088|Aged White Cheddar Popcorn |
| 1187899| 27845|Organic Whole Milk |
| 1187899| 49235|Organic Half & Half |
| 1187899| 46149|Zero Calorie Cola |
| 1492625| 22963|Organic Roasted Turkey Breast |
| 1492625| 7963|Gluten Free Whole Grain Bread |
| 1492625| 16589|Plantain Chips |
| 1492625| 32792|Chipotle Beef & Pork Realstick |
The code used to produce the data frame above is:
temp <- orders %>%
  inner_join(opt, by = "order_id") %>%
  inner_join(products, by = "product_id") %>%
  select(order_id, product_id, product_name)
kable(head(temp, 15))
I want to find the most frequently ordered products. Basically, my output should look like this:
product_id | Order_Count
196 10025
7963 9025
25133 8903
I cannot figure out how to approach this. I have tried:
mutate(prods = count(product_id))
But it did not work; I received an error saying: Error in mutate_impl(.data, dots) :
Evaluation error: no applicable method for 'groups' applied to an object of class "factor".
Any help would be greatly appreciated!
Answer 0: (score: 0)
You can use table() to print a simple table (as Rui Barradas mentioned), or, if you want a data frame with the counts, use dplyr::count().
library(tidyverse)
orders <- tibble::tribble(
~order_id, ~product_id, ~product_name,
"1187899", "196", "Soda",
"1187899", "25133", "Organic String Cheese",
"1187899", "38928", "0% Greek Strained Yogurt",
"1187899", "26405", "XL Pick-A-Size Paper Towel Rolls",
"1187899", "39657", "Milk Chocolate Almonds",
"1187899", "10258", "Pistachios",
"1187899", "10258", "Pistachios",
"1187899", "10258", "Pistachios",
"1187899", "13032", "Cinnamon Toast Crunch",
"1187899", "13032", "Cinnamon Toast Crunch",
"1187899", "26088", "Aged White Cheddar Popcorn",
"1187899", "27845", "Organic Whole Milk",
"1187899", "49235", "Organic Half & Half",
"1187899", "46149", "Zero Calorie Cola",
"1492625", "22963", "Organic Roasted Turkey Breast",
"1492625", "7963", "Gluten Free Whole Grain Bread",
"1492625", "16589", "Plantain Chips",
"1492625", "32792", "Chipotle Beef & Pork Realstick"
)
A simple printed table with (for example) the count for each product_id:
table(orders$product_id)
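Since the question asks for the most ordered products first, the table can also be sorted. The following is a minimal sketch, not part of the original answer; `product_ids` is a made-up vector standing in for `orders$product_id`.

```r
# Hypothetical vector standing in for orders$product_id
product_ids <- c("196", "7963", "196", "25133", "196", "7963")

# table() tallies each distinct value; sort() orders the tallies,
# largest first when decreasing = TRUE
sorted_counts <- sort(table(product_ids), decreasing = TRUE)
print(sorted_counts)

# head() keeps only the leading entries, here the two most frequent
top_two <- head(sorted_counts, 2)
print(top_two)
```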
But if you want a data frame with the counts, to plot or to use for anything else, then:
orders %>%
count(product_id, product_name)
> # A tibble: 15 x 3
> product_id product_name n
> <chr> <chr> <int>
> 1 10258 Pistachios 3
> 2 13032 Cinnamon Toast Crunch 2
> 3 16589 Plantain Chips 1
> 4 196 Soda 1
> 5 22963 Organic Roasted Turkey Breast 1
> 6 25133 Organic String Cheese 1
> 7 26088 Aged White Cheddar Popcorn 1
> 8 26405 XL Pick-A-Size Paper Towel Rolls 1
> 9 27845 Organic Whole Milk 1
> 10 32792 Chipotle Beef & Pork Realstick 1
> 11 38928 0% Greek Strained Yogurt 1
> 12 39657 Milk Chocolate Almonds 1
> 13 46149 Zero Calorie Cola 1
> 14 49235 Organic Half & Half 1
> 15 7963 Gluten Free Whole Grain Bread 1
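To get closer to the layout requested in the question (product_id next to Order_Count, highest count first), count() accepts `sort` and `name` arguments. Below is a minimal sketch under a few assumptions: `orders_sample` is made-up data in the same shape as the question's data frame, and a dplyr version that supports the `name` argument (0.8.0 or later) is installed.

```r
library(dplyr)

# Made-up sample in the same shape as the question's data (an assumption,
# not the real orders data)
orders_sample <- tibble::tribble(
  ~order_id, ~product_id, ~product_name,
  1187899, 196,   "Soda",
  1187899, 25133, "Organic String Cheese",
  1492625, 196,   "Soda",
  1492625, 7963,  "Gluten Free Whole Grain Bread",
  1500000, 196,   "Soda",
  1500000, 7963,  "Gluten Free Whole Grain Bread"
)

# count() takes the data frame as its first argument (here via the pipe);
# sort = TRUE orders by descending count, name = sets the count column name
order_counts <- orders_sample %>%
  count(product_id, name = "Order_Count", sort = TRUE)

print(order_counts)
```

This also suggests why the attempt in the question failed: inside mutate(), count(product_id) is handed a single column (a factor) where it expects a data frame, which would explain the "no applicable method for 'groups'" message. Note that count() tallies rows, so if a product can appear more than once within one order, calling distinct(order_id, product_id) beforehand would count orders rather than line items.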