I have a data frame, df, with the following structure:

NEW_UPC IRI_KEY WEEK DOLLARS
13000016961 272568 1220 3.29
13000016961 272568 1221 3.29
13000016961 272568 1222 3.29
13000016961 272568 1223 9.87
13000016962 272568 1224 3.29
13000016961 272568 1224 9.87
13000016962 272568 1225 3.29
13000016961 272568 1225 9.87
13000016962 272568 1226 3.29
13000016961 272568 1226 9.87
13000016961 272568 1227 9.87
13000016961 272568 1228 3.29
13000016963 272568 1228 3.29
13000016963 272568 1229 3.29
13000016962 272568 1230 3.29
13000016961 272568 1230 3.29
13000016963 272568 1230 13.16
13000016962 272568 1231 3.29
13000016963 272568 1231 9.87
21600016430 272568 1231 17.43
13000016962 272568 1232 9.87
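
I am trying to get the sum of DOLLARS over the first 12 weeks for each NEW_UPC-IRI_KEY combination. I tried the following code:

df %>%
  group_by(NEW_UPC, IRI_KEY) %>%
  mutate(START = min(WEEK), END = max(WEEK)) %>% ungroup() %>%
  group_by(NEW_UPC, IRI_KEY) %>%
  summarise(Sales = case_when(WEEK <= (START + 12) ~ sum(DOLLARS)))

However, I get the following error message:

Error in summarise_impl(.data, dots) :
  Column `Sales` must be length 1 (a summary value), not 8

What am I doing wrong here?

Edited: I changed the values in the Sales column to the actual totals, to avoid the confusion raised in the comments.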

The final output I would like to obtain is as follows:

NEW_UPC IRI_KEY Sales
13000016961 272568 65.8
13000016962 272568 26.3
13000016963 272568 29.6
21600016430 272568 17.4
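
Note that the values in the Sales column above were originally just random numbers that I used to show the structure of the output. Also, if a NEW_UPC has DOLLARS values spanning more than 12 weeks from START, I only want the total for the first 12 weeks, so the Sales column should return the total for the first twelve weeks from START. If a NEW_UPC has DOLLARS values for fewer than 12 weeks from START, Sales should return the total for that period.
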
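Answer 0 (score: 1)

You are close. You can sort the data on WEEK and then take the first 12 rows of each group with head(), which gives you the data for the first 12 weeks. Note that this counts rows, not weeks, so it matches "the first 12 weeks" only when each group has one row per week and no missing weeks in between. You can try:
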
library(dplyr)

df %>%
  group_by(NEW_UPC, IRI_KEY) %>%
  arrange(WEEK) %>%
  summarise(Sales = sum(head(DOLLARS, 12)))
# # A tibble: 4 x 3
# # Groups: NEW_UPC [?]
# NEW_UPC IRI_KEY Sales
# <dbl> <int> <dbl>
# 1 13000016961 272568 65.8
# 2 13000016962 272568 26.3
# 3 13000016963 272568 29.6
# 4 21600016430 272568 17.4
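
As for why the original attempt failed, here is a minimal sketch on a hypothetical toy group (the three-row vectors below are made up for illustration and are not taken from the question). case_when() is vectorised, so it returns one value per row of the group, while the summarise() call that raised the error expects a single value per group. The sketch keeps the question's own condition, WEEK <= START + 12.

```r
library(dplyr)

# Hypothetical toy group: three rows, the last one outside the window.
WEEK    <- c(1220, 1221, 1240)
DOLLARS <- c(3.29, 3.29, 9.87)
START   <- min(WEEK)

# case_when() evaluates the condition row by row, so the result has one
# element per row (NA where no condition matches), not one per group.
per_row <- case_when(WEEK <= START + 12 ~ sum(DOLLARS))
length(per_row)    # 3

# Subsetting first and summing afterwards collapses the group to one number.
one_value <- sum(DOLLARS[WEEK <= START + 12])
length(one_value)  # 1
```

Seen this way, the fix is to do the row selection inside the sum rather than wrapping the sum in a row-wise condition.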

Data:

df <- read.table(text="
NEW_UPC IRI_KEY WEEK DOLLARS
13000016961 272568 1220 3.29
13000016961 272568 1221 3.29
13000016961 272568 1222 3.29
13000016961 272568 1223 9.87
13000016962 272568 1224 3.29
13000016961 272568 1224 9.87
13000016962 272568 1225 3.29
13000016961 272568 1225 9.87
13000016962 272568 1226 3.29
13000016961 272568 1226 9.87
13000016961 272568 1227 9.87
13000016961 272568 1228 3.29
13000016963 272568 1228 3.29
13000016963 272568 1229 3.29
13000016962 272568 1230 3.29
13000016961 272568 1230 3.29
13000016963 272568 1230 13.16
13000016962 272568 1231 3.29
13000016963 272568 1231 9.87
21600016430 272568 1231 17.43
13000016962 272568 1232 9.87",
header = TRUE)
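
In the sample data above, no group has more than 12 rows or spans more than 12 weeks, so taking the first 12 rows and taking a 12-week window give the same totals. If your real data can have gaps in WEEK, the two can differ. The following is a rough sketch of a window-based variant, under the assumptions that WEEK is a consecutive integer week index and that "the first 12 weeks" means the weeks from min(WEEK) up to min(WEEK) + 11. The toy data frame is hypothetical and only there to make the difference visible.

```r
library(dplyr)

# Hypothetical toy data: product 1 has a gap, its third row is 15 weeks
# after its first one.
toy <- data.frame(
  NEW_UPC = c(1, 1, 1, 2),
  IRI_KEY = c(10, 10, 10, 10),
  WEEK    = c(100, 101, 115, 100),
  DOLLARS = c(1, 2, 4, 8)
)

# Row-based: first 12 rows per group after sorting by WEEK.
by_rows <- toy %>%
  group_by(NEW_UPC, IRI_KEY) %>%
  arrange(WEEK) %>%
  summarise(Sales = sum(head(DOLLARS, 12))) %>%
  ungroup()

# Window-based (assumption: consecutive integer weeks): only rows whose
# WEEK falls within 12 weeks of the group's first week.
by_window <- toy %>%
  group_by(NEW_UPC, IRI_KEY) %>%
  summarise(Sales = sum(DOLLARS[WEEK < min(WEEK) + 12])) %>%
  ungroup()

by_rows$Sales    # 7 8
by_window$Sales  # 3 8
```

Which of the two you want depends on whether a missing week should count towards the 12 or not.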