I have a data frame that looks something like this:
Day Salesperson Value
==== ============ =====
Monday John 40
Monday Sarah 50
Tuesday John 60
Tuesday Sarah 30
Wednesday John 50
Wednesday Sarah 40
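I want to divide each salesperson's Value by the number of times each day of the week occurred. So: there were 3 Mondays, 3 Tuesdays and 2 Wednesdays. I don't have this information in digital form, but I can create a vector c(3, 3, 2).
How can I conditionally divide the Value column according to the number of times each day occurred?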
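I found an inelegant solution: copy the Day column to a temp column, then replace each weekday name in the new column with the number of times that day occurred, using
df$temp <- sub("Monday", 3, df$temp)
But this seems a bit clunky. Is there a neat way to do this?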
Answer 0: (score: 2)
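You can use the dplyr library to merge the data frame with the frequency of each day.
aux <- df %>% group_by(Day) %>% summarise(n=n())
output <- df %>% left_join(aux, by="Day") %>% mutate(Value2=Value/n)
> output
Day Salesperson Value n Value2
1 Monday John 40 2 20
2 Monday Sarah 50 2 25
3 Tuesday John 60 2 30
4 Tuesday Sarah 30 2 15
5 Wednesday John 50 2 25
6 Wednesday Sarah 40 2 20
Here n is the number of rows for each day in df. This builds the auxiliary table from the days that appear in the original data instead of creating it by hand. You can use:
aux <- df %>% group_by(Day) %>% summarise(n=n())
If you want to replace the actual Value column, use mutate(Value=Value/n), and to drop the extra column you can add select(-n):
output <- df %>% left_join(aux, by="Day") %>% mutate(Value=Value/n) %>% select(-n)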
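For comparison, the same per-day row count can be sketched in base R without a join. This is only an illustration, not part of the answer above: the data frame is retyped from the question, and rows_per_day is a hypothetical name.

```r
# Example data retyped from the question.
df <- data.frame(
  Day = c("Monday", "Monday", "Tuesday", "Tuesday", "Wednesday", "Wednesday"),
  Salesperson = c("John", "Sarah", "John", "Sarah", "John", "Sarah"),
  Value = c(40, 50, 60, 30, 50, 40)
)

# ave() applies FUN within each Day group and returns one result per row,
# so length() yields the number of rows that share each row's Day.
rows_per_day <- ave(df$Value, df$Day, FUN = length)

# Divide each Value by the row count of its day.
df$Value2 <- df$Value / rows_per_day
print(df)
```

Like the dplyr version, this counts rows in df (2 per day here), so it only matches the question's 3, 3, 2 if those counts are actually present in the data.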
Answer 1: (score: 2)
Assuming your auxiliary data is in another data.frame:
Day N_Day
1 Monday 3
2 Tuesday 3
3 Wednesday 2
The simplest approach is a merge:
DF_new <- merge(DF, DF2, by="Day")
DF_new$newcol <- DF_new$Value / DF_new$N_Day
which gives
Day Salesperson Value N_Day newcol
1 Monday John 40 3 13.33333
2 Monday Sarah 50 3 16.66667
3 Tuesday John 60 3 20.00000
4 Tuesday Sarah 30 3 10.00000
5 Wednesday John 50 2 25.00000
6 Wednesday Sarah 40 2 20.00000
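One thing to keep in mind with the merge approach: by default merge() keeps only rows whose Day appears in both data frames. The sketch below uses made-up data (sales, counts, and the Thursday row are hypothetical, not from the question) to show how all.x = TRUE keeps unmatched rows with NA instead of dropping them.

```r
# Hypothetical data: Thursday has a sale but no entry in the counts table.
sales <- data.frame(
  Day = c("Monday", "Tuesday", "Thursday"),
  Value = c(40, 60, 10)
)
counts <- data.frame(
  Day = c("Monday", "Tuesday"),
  N_Day = c(3, 3)
)

# Default merge: rows without a match in counts are dropped.
dropped <- merge(sales, counts, by = "Day")

# all.x = TRUE keeps every row of sales; unmatched days get N_Day = NA.
kept <- merge(sales, counts, by = "Day", all.x = TRUE)
kept$newcol <- kept$Value / kept$N_Day

print(nrow(dropped))
print(kept)
```

With all.x = TRUE, a missing count shows up as NA in newcol rather than as a silently missing row.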
A shortcut without a merge is
DF$newcol <- DF$Value / DF2$N_Day[match(DF$Day, DF2$Day)]
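If the counts exist only as the vector c(3, 3, 2) mentioned in the question, a similar lookup can be sketched with a named vector instead of a second data frame. This is a minimal illustration: the data frame is retyped from the question, and n_days is a hypothetical name.

```r
# Example data retyped from the question.
df <- data.frame(
  Day = c("Monday", "Monday", "Tuesday", "Tuesday", "Wednesday", "Wednesday"),
  Salesperson = c("John", "Sarah", "John", "Sarah", "John", "Sarah"),
  Value = c(40, 50, 60, 30, 50, 40)
)

# Counts from the question, named by day so they can be looked up by name.
n_days <- c(Monday = 3, Tuesday = 3, Wednesday = 2)

# Index by day name; as.character() guards against Day being a factor
# (indexing with a factor would use its integer codes, not its labels).
# unname() drops the day names carried over from the lookup.
df$newcol <- df$Value / unname(n_days[as.character(df$Day)])
print(df)
```

A day that is missing from n_days would produce NA in newcol, which makes gaps in the counts easy to spot.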
Data:
DF <- structure(list(Day = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label =
c("Monday",
"Tuesday", "Wednesday"), class = "factor"), Salesperson = structure(c(1L,
2L, 1L, 2L, 1L, 2L), .Label = c("John", "Sarah"), class = "factor"),
Value = c(40L, 50L, 60L, 30L, 50L, 40L)), .Names = c("Day",
"Salesperson", "Value"), class = "data.frame", row.names = c(NA,
-6L))
DF2 <- structure(list(Day = structure(1:3, .Label = c("Monday", "Tuesday",
"Wednesday"), class = "factor"), N_Day = c(3, 3, 2)), .Names = c("Day",
"N_Day"), row.names = c(NA, -3L), class = "data.frame")