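I want to sort a data frame in R by a column and add the rank in a new column.
Specifically, I want to rank the price column of the data.frame below in ascending order, separately for each day. I then want to add a column that gives each hour's rank within its day.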
library(dplyr)
prices <- data.frame(time = c("2014-07-01 00:00:00 CEST","2014-07-01 01:00:00 CEST","2014-07-01 02:00:00 CEST","2014-07-01 03:00:00 CEST",
"2014-07-01 04:00:00 CEST","2014-07-01 05:00:00 CEST","2014-07-01 06:00:00 CEST","2014-07-01 07:00:00 CEST",
"2014-07-01 08:00:00 CEST","2014-07-01 09:00:00 CEST","2014-07-01 10:00:00 CEST","2014-07-01 11:00:00 CEST",
"2014-07-01 12:00:00 CEST","2014-07-01 13:00:00 CEST","2014-07-01 14:00:00 CEST","2014-07-01 15:00:00 CEST",
"2014-07-01 16:00:00 CEST","2014-07-01 17:00:00 CEST","2014-07-01 18:00:00 CEST","2014-07-01 19:00:00 CEST",
"2014-07-01 20:00:00 CEST","2014-07-01 21:00:00 CEST","2014-07-01 22:00:00 CEST","2014-07-01 23:00:00 CEST",
"2014-07-02 00:00:00 CEST","2014-07-02 01:00:00 CEST","2014-07-02 02:00:00 CEST","2014-07-02 03:00:00 CEST",
"2014-07-02 04:00:00 CEST","2014-07-02 05:00:00 CEST","2014-07-02 06:00:00 CEST","2014-07-02 07:00:00 CEST",
"2014-07-02 08:00:00 CEST","2014-07-02 09:00:00 CEST","2014-07-02 10:00:00 CEST","2014-07-02 11:00:00 CEST",
"2014-07-02 12:00:00 CEST","2014-07-02 13:00:00 CEST","2014-07-02 14:00:00 CEST","2014-07-02 15:00:00 CEST",
"2014-07-02 16:00:00 CEST","2014-07-02 17:00:00 CEST","2014-07-02 18:00:00 CEST","2014-07-02 19:00:00 CEST",
"2014-07-02 20:00:00 CEST","2014-07-02 21:00:00 CEST","2014-07-02 22:00:00 CEST","2014-07-02 23:00:00 CEST"),
price = c(31.75,30.54,30.10,29.32,25.97,26.90,33.59,41.06,40.99,42.44,40.00,39.94,35.69,36.00,36.00,35.17,34.94,35.18,39.00,
41.92,40.09,38.87,39.38,36.00,30.26,29.29,29.37,25.15,25.81,27.97,31.63,39.91,39.99,39.61,39.13,40.43,38.41,36.96,
36.00,34.95,33.82,36.08,38.59,39.91,39.02,36.90,38.88,32.59))
I used arrange from dplyr for the sorting, as shown below.
prices_sorted <- arrange(prices, as.Date(time), price)
Is there a 'clean' way to arrive at the following?
prices_ranked
time price ranking
1 2014-07-01 00:00:00 CEST 31.75 5
2 2014-07-01 01:00:00 CEST 30.54 6
3 2014-07-01 02:00:00 CEST 30.10 4
4 2014-07-01 03:00:00 CEST 29.32 3
5 2014-07-01 04:00:00 CEST 25.97 2
6 2014-07-01 05:00:00 CEST 26.90 1
7 2014-07-01 06:00:00 CEST 33.59 7
8 2014-07-01 07:00:00 CEST 41.06 17
9 2014-07-01 08:00:00 CEST 40.99 16
10 2014-07-01 09:00:00 CEST 42.44 18
11 2014-07-01 10:00:00 CEST 40.00 13
12 2014-07-01 11:00:00 CEST 39.94 14
13 2014-07-01 12:00:00 CEST 35.69 15
14 2014-07-01 13:00:00 CEST 36.00 24
15 2014-07-01 14:00:00 CEST 36.00 22
16 2014-07-01 15:00:00 CEST 35.17 19
17 2014-07-01 16:00:00 CEST 34.94 23
18 2014-07-01 17:00:00 CEST 35.18 12
19 2014-07-01 18:00:00 CEST 39.00 11
20 2014-07-01 19:00:00 CEST 41.92 21
21 2014-07-01 20:00:00 CEST 40.09 9
22 2014-07-01 21:00:00 CEST 38.87 8
23 2014-07-01 22:00:00 CEST 39.38 20
24 2014-07-01 23:00:00 CEST 36.00 10
25 2014-07-02 00:00:00 CEST 30.26 4
26 2014-07-02 01:00:00 CEST 29.29 5
27 2014-07-02 02:00:00 CEST 29.37 6
28 2014-07-02 03:00:00 CEST 25.15 2
29 2014-07-02 04:00:00 CEST 25.81 3
30 2014-07-02 05:00:00 CEST 27.97 1
31 2014-07-02 06:00:00 CEST 31.63 7
32 2014-07-02 07:00:00 CEST 39.91 24
33 2014-07-02 08:00:00 CEST 39.99 17
34 2014-07-02 09:00:00 CEST 39.61 16
35 2014-07-02 10:00:00 CEST 39.13 15
36 2014-07-02 11:00:00 CEST 40.43 18
37 2014-07-02 12:00:00 CEST 38.41 22
38 2014-07-02 13:00:00 CEST 36.96 14
39 2014-07-02 14:00:00 CEST 36.00 13
40 2014-07-02 15:00:00 CEST 34.95 19
41 2014-07-02 16:00:00 CEST 33.82 23
42 2014-07-02 17:00:00 CEST 36.08 21
43 2014-07-02 18:00:00 CEST 38.59 11
44 2014-07-02 19:00:00 CEST 39.91 10
45 2014-07-02 20:00:00 CEST 39.02 8
46 2014-07-02 21:00:00 CEST 36.90 20
47 2014-07-02 22:00:00 CEST 38.88 9
48 2014-07-02 23:00:00 CEST 32.59 12
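Answer 0 (score: 1)
I'm not entirely clear on what ordering you want, but is this what you are after? Updated to rank by date (I added some extra data so you can see the effect).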
library(data.table)
prices <- data.table(time = c("2014-07-01 00:00:00 CEST", "2014-07-01 01:00:00 CEST", "2014-07-01 02:00:00 CEST","2014-07-01 03:00:00 CEST", "2014-07-01 04:00:00 CEST",
"2015-07-01 00:00:00 CEST", "2015-07-01 01:00:00 CEST", "2015-07-01 02:00:00 CEST","2015-07-01 03:00:00 CEST", "2015-07-01 04:00:00 CEST"),
price = c(31.75, 30.54, 30.10, 29.32, 25.97,31.75, 30.12, 31.10, 39.32, 25.97))
prices <- prices[,"date" := as.Date(time)]
prices.sorted <- prices[order(time),ranking := rank(price,ties.method='first'), by=date]
Answer 1 (score: 0)
Maybe this:
prices %>% arrange(price) %>% mutate(ranking=min_rank(price)) %>% arrange(time)
# time price ranking
#1 2014-07-01 00:00:00 CEST 31.75 5
#2 2014-07-01 01:00:00 CEST 30.54 4
#3 2014-07-01 02:00:00 CEST 30.10 3
#4 2014-07-01 03:00:00 CEST 29.32 2
#5 2014-07-01 04:00:00 CEST 25.97 1
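If the rank should restart each day, one possible variant is to group by the date part of time before ranking. This is only a sketch: the helper column day is a name I made up, it assumes time is a character string that always starts with YYYY-MM-DD, and the six-row sample is a cut-down subset of the question's data.

```r
library(dplyr)

# Cut-down sample: three hours from each of the two days in the question.
prices <- data.frame(
  time = c("2014-07-01 00:00:00 CEST", "2014-07-01 01:00:00 CEST",
           "2014-07-01 02:00:00 CEST", "2014-07-02 00:00:00 CEST",
           "2014-07-02 01:00:00 CEST", "2014-07-02 02:00:00 CEST"),
  price = c(31.75, 30.54, 30.10, 30.26, 29.29, 29.37),
  stringsAsFactors = FALSE
)

prices_ranked <- prices %>%
  mutate(day = substr(time, 1, 10)) %>%   # hypothetical helper column: "YYYY-MM-DD"
  group_by(day) %>%                       # ranks restart within each day
  mutate(ranking = min_rank(price)) %>%   # 1 = cheapest hour of that day
  ungroup() %>%
  select(-day)                            # drop the helper column again
```

The original row order is kept, so no final arrange(time) is needed. min_rank gives tied prices the same rank; row_number could be swapped in if every hour needs a distinct rank.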
Answer 2 (score: 0)
Here is my solution, which uses the base R function sort:
prices %>% mutate(ranking = row_number(sort(price, decreasing = TRUE)))
time price ranking
1 2014-07-01 00:00:00 CEST 31.75 5
2 2014-07-01 01:00:00 CEST 30.54 4
3 2014-07-01 02:00:00 CEST 30.10 3
4 2014-07-01 03:00:00 CEST 29.32 2
5 2014-07-01 04:00:00 CEST 25.97 1
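For a ranking that restarts each day without dplyr, a base R sketch could use ave with rank. The assumptions here are mine: time is a character string starting with YYYY-MM-DD, and the six-row sample is a cut-down subset of the question's data.

```r
# Cut-down sample: three hours from each of the two days in the question.
prices <- data.frame(
  time = c("2014-07-01 00:00:00 CEST", "2014-07-01 01:00:00 CEST",
           "2014-07-01 02:00:00 CEST", "2014-07-02 00:00:00 CEST",
           "2014-07-02 01:00:00 CEST", "2014-07-02 02:00:00 CEST"),
  price = c(31.75, 30.54, 30.10, 30.26, 29.29, 29.37),
  stringsAsFactors = FALSE
)

# ave() applies FUN to price within each day and returns the results
# in the original row order.
prices$ranking <- ave(
  prices$price,
  substr(prices$time, 1, 10),                       # grouping key: the date part
  FUN = function(x) rank(x, ties.method = "min")    # 1 = cheapest hour of that day
)
```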