I want to work out the best process for producing summary text in a final report.
library(tibble)  # provides tribble()

x <- tribble(
  ~year,  ~service, ~account,     ~amount,
  "2001", "Army",   "operations", 5000000,
  "2001", "Navy",   "operations", 1500000,
  "2002", "Army",   "operations", 6000000,
  "2002", "Navy",   "operations", 1700000,
  "2001", "Army",   "repair",      500000,
  "2001", "Navy",   "repair",      300000,
  "2002", "Army",   "repair",      400000,
  "2002", "Navy",   "repair",      600000)
The desired text for each service:
"Between [year.min] and [year.max], the [service]
spent an average of [average amount]. The largest account
in terms of spending within the [service] was [account],
which ranked [rank] and fluctuated between [min amount]
and [max amount], with a high of [max amount] in [year] to
a low of [min] in [year]."
The desired output would be a table. The process would be repeated at many sub-levels (account, sub-account, etc.).
  service summary_text
  <chr>   <chr>
1 Army    concatenated
2 Navy    concatenated
Ultimately, I want to export the result as an HTML table with the text next to a sparkline, which is fairly simple to do in Excel.
  service sparkline summary_text
  <chr>   <chr>     <chr>
1 Army    sparkline concatenated text
2 Navy    sparkline concatenated text
Answer 0 (score: 3)
Use dplyr and glue together, with different grouping strategies:
library(dplyr)
library(glue)
output <- x %>%
  # total spending per account within each service
  group_by(service, account) %>%
  mutate(amount_sum = sum(amount)) %>%
  # average over all of a service's rows, computed before filtering
  group_by(service) %>%
  mutate(average.amount = mean(amount)) %>%
  # keep only the rows of each service's largest account
  filter(amount_sum == max(amount_sum)) %>%
  summarize(
    year.min = min(year),
    year.max = max(year),
    average.amount = first(average.amount),
    account = first(account),
    rank = 1,
    min.amount = min(amount),
    max.amount = max(amount),
    year.min.amount = year[which.min(amount)],
    year.max.amount = year[which.max(amount)]) %>%
  transmute(service,
            summary_text = glue("Between {year.min} and {year.max}, the {service}
spent an average of {average.amount}. The largest account
in terms of spending within the {service} was {account},
which ranked {rank} and fluctuated between {min.amount}
and {max.amount}, with a high of {max.amount} in {year.max.amount} to
a low of {min.amount} in {year.min.amount}."))

output %>% pull(summary_text)
# Between 2001 and 2002, the Army
# spent an average of 2975000. The largest account
# in terms of spending within the Army was operations,
# which ranked 1 and fluctuated between 5e+06
# and 6e+06, with a high of 6e+06 in 2002 to
# a low of 5e+06 in 2001.
# Between 2001 and 2002, the Navy
# spent an average of 1025000. The largest account
# in terms of spending within the Navy was operations,
# which ranked 1 and fluctuated between 1500000
# and 1700000, with a high of 1700000 in 2002 to
# a low of 1500000 in 2001.
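The `5e+06` in the Army text appears to come from the default numeric-to-character conversion applied before interpolation. A minimal sketch of one way around it, assuming base R's `format()` is acceptable for display; `fmt_amount` is a hypothetical helper name that is not part of the answer above:

```r
# Hypothetical helper: render an amount with thousands separators
# and without scientific notation.
fmt_amount <- function(x) {
  format(x, big.mark = ",", scientific = FALSE, trim = TRUE)
}

fmt_amount(5000000)              # "5,000,000"
fmt_amount(c(1500000, 1700000))  # "1,500,000" "1,700,000"
```

A call such as `{fmt_amount(max.amount)}` could then be placed inside the glue braces, since glue evaluates arbitrary R expressions there.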
If you want to limit external library dependencies, you can use paste or sprintf instead of glue, but the example is more readable this way.
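For comparison, here is a rough base-R sketch of the first part of the sentence built with `sprintf()`. The `army` list is an illustrative stand-in holding the Army values from the output above; in a pipeline the same `sprintf()` call could sit inside `transmute()`, because `sprintf()` is vectorised over its arguments.

```r
# Illustrative values copied from the Army row of the example output.
army <- list(service = "Army", year.min = "2001", year.max = "2002",
             average.amount = 2975000, account = "operations")

# Each %s placeholder is filled positionally, so the service name
# has to be passed twice.
summary_text <- sprintf(
  "Between %s and %s, the %s spent an average of %s. The largest account in terms of spending within the %s was %s.",
  army$year.min, army$year.max, army$service,
  format(army$average.amount, big.mark = ",", scientific = FALSE),
  army$service, army$account
)

summary_text
```

The positional placeholders are what make this harder to read than the glue version: the template no longer shows which value goes where.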
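In this example I assumed that rank is always 1. If you want to handle sub-accounts, I suggest using the same trick I did: call group_by and mutate before summarize, so that you can create new columns that are constant within each group. Then call first inside summarize to pick up their values.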
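If the rank should be computed rather than hard-coded, one possible sketch is below. It assumes that "rank" means the position of each account within its service by total spending, which the question does not state explicitly; `account_ranks` is a hypothetical name, and the `.groups` argument requires dplyr 1.0.0 or later.

```r
library(dplyr)
library(tibble)

x <- tribble(
  ~year,  ~service, ~account,     ~amount,
  "2001", "Army",   "operations", 5000000,
  "2001", "Navy",   "operations", 1500000,
  "2002", "Army",   "operations", 6000000,
  "2002", "Navy",   "operations", 1700000,
  "2001", "Army",   "repair",      500000,
  "2001", "Navy",   "repair",      300000,
  "2002", "Army",   "repair",      400000,
  "2002", "Navy",   "repair",      600000)

# Total per (service, account), then rank accounts within each service,
# with the largest total receiving rank 1.
account_ranks <- x %>%
  group_by(service, account) %>%
  summarize(amount_sum = sum(amount), .groups = "drop") %>%
  group_by(service) %>%
  mutate(rank = min_rank(desc(amount_sum))) %>%
  ungroup()

account_ranks
```

Joining `account_ranks` back onto the summary by service and account would make `{rank}` available to the text template for any account, not only the largest one.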
Answer 1 (score: 0)
This is Moody Mudskipper's answer with a bit of spark added: a sparkline column.
library(tidyverse)
library(sparkline)
library(formattable)
library(glue)
#Data
x <- tribble(
~year, ~service, ~account, ~amount,
"2001", "Army", "operations", 5000000,
"2001", "Navy", "operations", 1500000,
"2002", "Army", "operations", 6000000,
"2002", "Navy", "operations", 1700000,
"2001", "Army", "repair", 500000,
"2001", "Navy", "repair", 300000,
"2002", "Army", "repair", 400000,
"2002", "Navy", "repair", 600000)
# Assemble Text
table <- x %>%
  group_by(service, year) %>%
  summarise(total = sum(amount)) %>%
  group_by(service) %>%
  summarise(mean_annual_service = mean(total),
            # years range
            first.year = min(year),
            last.year = max(year),
            # min and max years, amounts
            year.min = year[which.min(total)],
            year.max = year[which.max(total)],
            min.amount = total[which.min(total)],
            max.amount = total[which.max(total)]) %>%
  # Final Text
  mutate(Description = glue('Between {first.year} and {last.year},
the average spending in the {service} was
${prettyNum(mean_annual_service, big.mark = ",")},
with a high of ${prettyNum(max.amount, big.mark = ",")} in {year.max}, and a low of
${prettyNum(min.amount, big.mark = ",")} in {year.min}') ) %>%
  select(service, Description)
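# Add Sparkline
x %>%
  group_by(service, year) %>%
  summarise(total = sum(amount)) %>%
  # still grouped by service here, so this yields one sparkline per service
  summarise(
    Sparkline = spk_chr(
      total,
      type = "line",
      chartRangeMin = min(total),
      chartRangeMax = max(total))) %>%
  # join the text on the shared key explicitly
  left_join(table, by = "service") %>%
  formattable() %>%
  as.htmlwidget() %>%
  spk_add_deps()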