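I have read through several questions without finding code that works, so thanks in advance for any help. This is a refinement of an earlier question; I could do this in Excel, but I am trying to get up to speed with R.
I have some sales data that is giving me a headache: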
date sales
14/11 39
14/11 3.2
14/11 13
14/11 8.3
14/11 5
14/11 5.6
14/11 79
14/11 35
14/11 24
14/11 8.1
14/11 21
14/11 40
14/11 50
14/11 82
15/11 8.3
15/11 7.2
15/11 63
15/11 31
15/11 35
15/11 2.1
15/11 31
15/11 11
15/11 3.8
15/11 29
15/11 NA
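I have already been shown how to group by date and find the bottom three performers, but I want the rest of the data to stay visible.
I would like another column that is TRUE for the three lowest-ranked sales on each date and FALSE otherwise.
I tried: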
if(data$sales == group_by(data$date)%>%top_n(n=-3, wt=sales)) {
data$top <- T
} else {
dat$top <- F
}
All I get is:
Error in UseMethod("group_by_") :
no applicable method for 'group_by_' applied to an object of class "factor"
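This is not my first attempt. I have tried loops, if/else, match and %in% and really struggled, but I did not want to dump a pile of bad code here.
Any ideas are much appreciated.
Answer 0 (score: 0)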
Hope this helps!
library(dplyr)
df %>%
group_by(date) %>%
arrange(date, sales) %>%
mutate(bottom3_performer = row_number() <=3)
Output is:
date sales bottom3_performer
1 14/11 3.2 TRUE
2 14/11 5.0 TRUE
3 14/11 5.6 TRUE
4 14/11 8.1 FALSE
5 14/11 8.3 FALSE
6 14/11 13.0 FALSE
7 14/11 21.0 FALSE
...
Sample data:
df <- structure(list(date = c("14/11", "14/11", "14/11", "14/11", "14/11",
"14/11", "14/11", "14/11", "14/11", "14/11", "14/11", "14/11",
"14/11", "14/11", "15/11", "15/11", "15/11", "15/11", "15/11",
"15/11", "15/11", "15/11", "15/11", "15/11", "15/11"), sales = c(39,
3.2, 13, 8.3, 5, 5.6, 79, 35, 24, 8.1, 21, 40, 50, 82, 8.3, 7.2,
63, 31, 35, 2.1, 31, 11, 3.8, 29, NA)), .Names = c("date", "sales"
), class = "data.frame", row.names = c(NA, -25L))
Another set of sample data and its output:
df <- structure(list(date = c("14/11", "14/11", "14/11", "14/11", "14/11",
"14/11", "14/11", "14/11", "14/11", "14/11", "14/11", "14/11",
"14/11", "14/11", "15/11", "15/11", "15/11", "15/11", "15/11",
"15/11", "15/11", "15/11", "15/11", "15/11", "15/11"), sales = c(39,
3.2, 13, 8.3, 5, 5.6, 79, 35, 24, 8.1, 21, 40, 50, 82, 8.3, 7.2,
63, 31, 35, 2.1, 31, 11, 3.8, 29, NA), id = 1:25, name = c("a",
"b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n",
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y")), .Names = c("date",
"sales", "id", "name"), row.names = c(NA, -25L), class = "data.frame")
date sales id name bottom3_performer
1 14/11 3.2 2 b TRUE
2 14/11 5.0 5 e TRUE
3 14/11 5.6 6 f TRUE
4 14/11 8.1 10 j FALSE
5 14/11 8.3 4 d FALSE
6 14/11 13.0 3 c FALSE
7 14/11 21.0 11 k FALSE
...
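The error in the question comes from passing a single column (data$date, a factor) to group_by(), which expects a data frame as its first argument. If reordering the rows is not wanted, here is a minimal base R sketch that flags the three lowest sales per date while leaving the rows in their original order. The names rank_in_group and bottom3 are illustrative assumptions, not part of the original post, and the NA row is treated as FALSE by choice.

```r
# Sample data copied from the question (25 rows, one NA in sales).
df <- data.frame(
  date = rep(c("14/11", "15/11"), times = c(14, 11)),
  sales = c(39, 3.2, 13, 8.3, 5, 5.6, 79, 35, 24, 8.1, 21, 40, 50, 82,
            8.3, 7.2, 63, 31, 35, 2.1, 31, 11, 3.8, 29, NA),
  stringsAsFactors = FALSE
)

# Rank sales within each date. ave() applies FUN per group and returns
# the results in the original row order, so no sorting is needed.
# na.last = "keep" leaves the rank of a missing sale as NA.
rank_in_group <- ave(
  df$sales, df$date,
  FUN = function(x) rank(x, ties.method = "min", na.last = "keep")
)

# TRUE for the three lowest sales per date; rows with NA sales become FALSE.
df$bottom3 <- !is.na(rank_in_group) & rank_in_group <= 3
```

With ties.method = "min", sales tied at third place are all flagged, so a group can end up with more than three TRUE rows when ties occur.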
Answer 1 (score: 0)
This should do it:
library(dplyr)
df %>%
group_by(date) %>%
mutate(bottom3 = rank(sales) <= 3)
# A tibble: 25 x 3
# Groups: date [2]
date sales bottom3
<chr> <dbl> <lgl>
1 15/11 2.10 T
2 14/11 3.20 T
3 15/11 3.80 T
4 14/11 5.00 T
5 14/11 5.60 T
6 15/11 7.20 T
7 14/11 8.10 F
8 14/11 8.30 F
9 15/11 8.30 F
10 15/11 11.0 F
# ... with 15 more rows
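Two details of rank() are worth keeping in mind with this approach: by default it gives a missing value a rank of NA, so the flag for the NA row comes out as NA rather than FALSE, and it averages the ranks of tied values, so two sales tied for third and fourth place both get 3.5 and neither is flagged. A hedged sketch of one way to handle both, using dplyr's min_rank() and an explicit NA check (the names flagged and bottom3 are assumptions for illustration):

```r
library(dplyr)

# Sample data copied from the question (25 rows, one NA in sales).
df <- data.frame(
  date = rep(c("14/11", "15/11"), times = c(14, 11)),
  sales = c(39, 3.2, 13, 8.3, 5, 5.6, 79, 35, 24, 8.1, 21, 40, 50, 82,
            8.3, 7.2, 63, 31, 35, 2.1, 31, 11, 3.8, 29, NA),
  stringsAsFactors = FALSE
)

flagged <- df %>%
  group_by(date) %>%
  # min_rank() gives tied values the same (lowest) rank and keeps NA as NA;
  # the !is.na() guard turns the flag for missing sales into FALSE.
  mutate(bottom3 = !is.na(sales) & min_rank(sales) <= 3) %>%
  ungroup()
```

The rows stay in their original order here. Adding arrange(date, sales) at the end would sort them for display if that is preferred.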