Summing a range of columns with dplyr's `mutate`

Time: 2016-02-22 17:24:29

Tags: r dplyr tidyr

I want to use `mutate` to sum several columns of a data frame, row by row.

library(dplyr)
library(tidyr)

# Create the data - one row per order
f <- data.frame(
  customer = rep(c(1, 2), each = 4),
  order_type = rep(c("direct", "express", "air", "regular"), 2),
  count = sample(1:100, 8, replace = TRUE))

# Spread the order data per-customer
f <- f %>%
  spread(order_type, count, fill = 0)

# Try to use mutate to sum up all types of orders
f %>%
  mutate(total = select(., air:regular) %>% rowSums)

I expected the last call to `mutate` to fill a new column with the row sums over the columns from `air` to `regular`. If I call `select(f, air:regular) %>% rowSums` outside of `mutate`, I get a vector containing the sums. Inside `mutate`, however, I get the following error:

Error: Position must be between 0 and n
In addition: Warning messages:
1: In c(10, 14):c(96, 83) :
  numerical expression has 2 elements: only the first used
2: In c(10, 14):c(96, 83) :
  numerical expression has 2 elements: only the first used

I believe I am missing something fundamental about `mutate` and how it evaluates its arguments.

I would like to understand how to perform this transformation with `mutate`.

Thanks!

2 answers:

Answer 0 (score: 2)

Thanks to a comment by @docendo-discimus, one workable solution is to use `tbl_dt`:

# Take note of the `tbl_dt` call:
f <- tbl_dt(data.frame(
  customer = rep(c(1,2), each = 4), 
  order_type = rep(c("direct","express","air","regular"), 2), 
  count = sample(1:100, 8, replace = T)))

# Spread the order data per-customer 
f <- f %>%
  spread(order_type, count, fill = 0) %>%
  mutate(total = select(., air:regular) %>% rowSums)

This requires the data.table package to be installed.

Another option is to use the programmable `select_`, which takes strings:

# Spread the order data per-customer 
f <- f %>%
  spread(order_type, count, fill = 0) %>%
  mutate(total = select_(., "air:regular") %>% rowSums)

A final option is to select the columns by numeric position:

f <- f %>%
  spread(order_type, count, fill = 0) %>%
  mutate(total = select(., 2:5) %>% rowSums)
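
For readers on a newer dplyr, here is a minimal sketch of the same row sum written with `across()` inside `mutate`. It assumes dplyr 1.0.0 or later, which is newer than this thread, so it is not one of the options the answer itself proposes. The fixed `count` values are an assumption made for reproducibility; they are copied from the `set.seed(123)` output shown in the other answer.

```r
library(dplyr)
library(tidyr)

# Fixed counts instead of sample(), so the totals are reproducible.
# (Assumption: values copied from the set.seed(123) output in this thread.)
f <- data.frame(
  customer   = rep(c(1, 2), each = 4),
  order_type = rep(c("direct", "express", "air", "regular"), 2),
  count      = c(29, 79, 41, 89, 95, 5, 53, 90)
)

# Assumes dplyr >= 1.0.0: across() returns the selected columns as a
# data frame, which rowSums() then sums row by row.
result <- f %>%
  spread(order_type, count, fill = 0) %>%
  mutate(total = rowSums(across(air:regular)))

result$total
```

The range `air:regular` relies on `spread` ordering the new columns alphabetically (air, direct, express, regular), the same assumption the numeric `2:5` selection makes.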

Answer 1 (score: 1)

You can do this without loading external packages or reshaping, using `xtabs` and `cbind`:

cbind(xtabs(count ~ customer + order_type, f),
      Total = margin.table(xtabs(count ~ customer + order_type, f), 1))

  air direct express regular Total
1  41     29      79      89   238
2  53     95       5      90   243

Update

With dplyr, as the OP requested, using your data with `set.seed(123)`:

f %>%
  spread(order_type, count, fill = 0) %>%
  group_by(customer) %>%
  cbind(., total = rowSums(.[, -1]))

  customer air direct express regular total
1        1  41     29      79      89   238
2        2  53     95       5      90   243
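
As a possible shortcut for the base R approach at the top of this answer, `addmargins` from the stats package can append the per-customer total to the `xtabs` table in one step. This is only a sketch: the fixed `count` values are an assumption copied from the output above, and `f` here is the original long data, before any `spread`.

```r
# Long-format order data with fixed counts (assumption: values copied
# from the set.seed(123) output shown in this answer).
f <- data.frame(
  customer   = rep(c(1, 2), each = 4),
  order_type = rep(c("direct", "express", "air", "regular"), 2),
  count      = c(29, 79, 41, 89, 95, 5, 53, 90)
)

# Cross-tabulate counts by customer (rows) and order type (columns).
tab <- xtabs(count ~ customer + order_type, f)

# margin = 2 extends the column dimension with a "Sum" column that
# holds each customer's total across order types.
with_total <- addmargins(tab, margin = 2)

with_total
```

The result stays a table rather than a data frame, so it suits printing a summary more than further dplyr processing.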