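The data frame below shows different companies, the products they received, and the dates they received them. What I want is for the "Time.Rank" column to reflect the chronological order of Received.Date (earliest to latest) within the same company (the Company column).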
df <- data.frame(
  Company = c("Walmart", "Walmart", "Walmart", "Walmart", "Walmart", "Staples", "Staples"),
  Product.Name = c("tape", "flower", "tape", "chocolate", "pencil", "pencil", "tape"),
  Received.Date = c("2013-09-30", "2013-09-30", "2015-05-08", "2015-05-08", "2015-05-08", "2014-12-12", "2014-12-17"),
  Time.Rank = c("1", "2", "3", "4", "5", "1", "2"))
My question is about the Time.Rank column. This is how I got the Time.Rank column:
library(dplyr)
df <- df %>%
  group_by(Company) %>%  # rank within each company
  mutate(Time.Rank = row_number(Received.Date))  # ties still get distinct ranks
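The problem is that even though rows 1-2 share one Received.Date and rows 3-5 share another, they still get different ranks. I want rows with the same Received.Date to have the same rank, i.e. rows 1 and 2 should both have Time.Rank = 1, and rows 3-5 should have Time.Rank = 2. So the result should look like this: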
df <- data.frame(
  Company = c("Walmart", "Walmart", "Walmart", "Walmart", "Walmart", "Staples", "Staples"),
  Product.Name = c("tape", "flower", "tape", "chocolate", "pencil", "pencil", "tape"),
  Received.Date = c("2013-09-30", "2013-09-30", "2015-05-08", "2015-05-08", "2015-05-08", "2014-12-12", "2014-12-17"),
  Time.Rank = c("1", "1", "2", "2", "2", "1", "2"))
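Answer 0 (score: 3)
I think what you are looking for is dense_rank, which gives tied values the same rank and leaves no gaps between ranks: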
df %>% group_by(Company) %>% mutate(Time.Rank = dense_rank(Received.Date))
  Company Product.Name Received.Date Time.Rank
1 Walmart         tape    2013-09-30         1
2 Walmart       flower    2013-09-30         1
3 Walmart         tape    2015-05-08         2
4 Walmart    chocolate    2015-05-08         2
5 Walmart       pencil    2015-05-08         2
6 Staples       pencil    2014-12-12         1
7 Staples         tape    2014-12-17         2
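To see why dense_rank fits here, it may help to put dplyr's three ranking helpers side by side on the question's data. The sketch below is only an illustration: the column names Row.Rank and Min.Rank and the object name ranked are made up for the comparison, and converting Received.Date with as.Date is an added assumption (the ISO-formatted strings in the example happen to sort correctly as text, but other date formats would not).

```r
library(dplyr)

# Same example data as in the question, without the hand-typed Time.Rank column.
df <- data.frame(
  Company = c("Walmart", "Walmart", "Walmart", "Walmart", "Walmart", "Staples", "Staples"),
  Product.Name = c("tape", "flower", "tape", "chocolate", "pencil", "pencil", "tape"),
  Received.Date = c("2013-09-30", "2013-09-30", "2015-05-08", "2015-05-08", "2015-05-08",
                    "2014-12-12", "2014-12-17"),
  stringsAsFactors = FALSE
)

ranked <- df %>%
  # Assumption: parse to Date so the ranking is chronological rather than alphabetical.
  mutate(Received.Date = as.Date(Received.Date)) %>%
  group_by(Company) %>%
  mutate(
    Row.Rank  = row_number(Received.Date),  # hypothetical column: ties get distinct ranks
    Min.Rank  = min_rank(Received.Date),    # hypothetical column: ties share a rank, gaps follow
    Time.Rank = dense_rank(Received.Date)   # ties share a rank, no gaps
  ) %>%
  ungroup()

print(ranked)
```

For the Walmart rows, Row.Rank comes out as 1, 2, 3, 4, 5, Min.Rank as 1, 1, 3, 3, 3, and Time.Rank as 1, 1, 2, 2, 2, so only dense_rank matches the output requested in the question.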