I have a dataset that looks like this:
id profile_id company product price
1 1 A book 10.42
2 1 A shirt 23.91
3 1 A cup 5.95
4 2 B book 7.99
5 2 B shirt 5.95
6 2 B cup 11.76
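I want to create a new column "rank" that shows the price rank of each product within each company and profile_id (1 = lowest price).
The output should look like this: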
id profile_id company product price rank
1 1 A book 10.42 2
2 1 A shirt 23.91 3
3 1 A cup 5.95 1
4 2 B book 7.99 2
5 2 B shirt 5.95 1
6 2 B cup 11.76 3
I feel like this should be easy, but I can't quite get it to work... Any help would be greatly appreciated!
Reproducible code:
df2 <- data.frame(id=c(1,2,3,4,5,6),
profile_id = c(1, 1, 1, 2, 2,2),
company = c("A","A","A","B","B","B"),
product = c("book", "shirt", "cup","book", "shirt", "cup"),
price = c(10.42, 23.91, 5.95, 7.99, 5.95, 11.76))
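Answer 0 (score: 1)
First group_by() the company and profile_id variables, then apply rank():
library(dplyr)
df2 %>% group_by(company, profile_id) %>% mutate(rank = rank(price))
# Or with data.table (setDT() converts the data.frame by reference first)
library(data.table)
setDT(df2)[, rank := rank(price), by = .(company, profile_id)]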
# id profile_id company product price rank
#1 1 1 A book 10.42 2
#2 2 1 A shirt 23.91 3
#3 3 1 A cup 5.95 1
#4 4 2 B book 7.99 2
#5 5 2 B shirt 5.95 1
#6 6 2 B cup 11.76 3
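One thing to watch with rank(): when two prices tie inside a group, it returns averaged ranks such as 1.5 by default. The following is a minimal sketch, assuming integer ranks are wanted in that case. The data frame `tied` is made up for illustration and is not part of the question, and dplyr's min_rank() and dense_rank() are swapped in next to rank() for comparison.

```r
library(dplyr)

# Hypothetical data with a price tie inside one group (not from the question)
tied <- data.frame(profile_id = c(1, 1, 1),
                   company = c("A", "A", "A"),
                   product = c("book", "shirt", "cup"),
                   price = c(5.95, 5.95, 10.42))

ranked <- tied %>%
  group_by(company, profile_id) %>%
  mutate(rank_avg = rank(price),           # ties get the average rank
         rank_min = min_rank(price),       # ties share the lowest rank, gap after
         rank_dense = dense_rank(price)) %>%  # ties share a rank, no gap after
  ungroup()
```

On the question's own data there are no ties within a group, so all three functions would give the same ranks there.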
Answer 1 (score: 0)
We can do this with base R:
df2$rank <- with(df2, ave(price, company, profile_id, FUN = rank))
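For a self-contained check, the base R approach can be run against the question's df2 as sketched below. The descending variant (`rank_desc`, built by negating the price) is an assumption added for illustration and is not part of the original answer.

```r
# Data from the question
df2 <- data.frame(id = c(1, 2, 3, 4, 5, 6),
                  profile_id = c(1, 1, 1, 2, 2, 2),
                  company = c("A", "A", "A", "B", "B", "B"),
                  product = c("book", "shirt", "cup", "book", "shirt", "cup"),
                  price = c(10.42, 23.91, 5.95, 7.99, 5.95, 11.76))

# Ascending rank within each company / profile_id group (1 = cheapest)
df2$rank <- with(df2, ave(price, company, profile_id, FUN = rank))

# Assumed variant: descending rank (1 = most expensive), by negating the price
df2$rank_desc <- with(df2, ave(price, company, profile_id,
                               FUN = function(x) rank(-x)))
```

ave() returns a vector in the original row order, so the result can be assigned straight back to the data frame without sorting or merging.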