I want to use combinations of Beer, Wine, Spirits and the row total SUM, for example:
df <- data.frame(
Beer = c(1L, 1L, 1L, 3L, 4L, 0L, 0L, 3L),
Wine = c(1L, 1L, 0L, 0L, 0L, 1L, 1L, 2L),
Spirits = c(0L, 1L, 0L, 0L, 0L, 1L, 2L, 1L),
SUM = c(2L, 3L, 1L, 3L, 4L, 2L, 3L, 6L)
)
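How can I create a new column COMBINE that pastes SUM together with the names of the columns whose values are greater than 0? Something like this, except that any SUM greater than 5 is treated as 5+.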
Beer Wine Spirits SUM COMBINE
1 1 0 2 2Beer&Wine
1 1 1 3 3Beer&Wine&Spirits
1 0 0 1 1Beer
1 0 0 1 1Beer
3 0 0 3 3Beer
4 0 0 4 4Beer
0 1 1 2 2Wine&Spirits
0 1 1 2 2Wine&Spirits
3 2 1 6 5+Beer&Wine&Spirits
For some added context, the end goal of all this is to count the factors in COMBINE, although that is not the part I am struggling with.
COMBINE Count
1Beer 2
2Beer 0
3Beer 1
4Beer 1
5+Beer 0
1Wine 0
2Wine 0
.
.
.
2Wine&Spirits 2
.....
Answer 0 (score: 2)
A direct solution using ifelse:
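# d is the question's data frame (called df there)
d$COMBINE <- with(d, gsub("&$", "",           # drop a trailing "&" when Spirits is 0
  paste0(ifelse(SUM > 5, "5+", SUM),          # totals above 5 become "5+"
         ifelse(Beer > 0, "Beer&", ""),
         ifelse(Wine > 0, "Wine&", ""),
         ifelse(Spirits > 0, "Spirits", ""))))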
Beer Wine Spirits SUM COMBINE
1 1 1 0 2 2Beer&Wine
2 1 1 1 3 3Beer&Wine&Spirits
3 1 0 0 1 1Beer
4 3 0 0 3 3Beer
5 4 0 0 4 4Beer
6 0 1 1 2 2Wine&Spirits
7 0 1 2 3 3Wine&Spirits
8 3 2 1 6 5+Beer&Wine&Spirits
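If there are more drink columns than these three, writing one ifelse per column gets tedious. The following is a minimal sketch of the same idea that builds the label from the column names instead; the `drinks` vector is an illustrative name, not part of the original answer, and the data frame is rebuilt here only so the snippet runs on its own.

```r
# Sample data from the question (named d, as in the answer above)
d <- data.frame(
  Beer    = c(1L, 1L, 1L, 3L, 4L, 0L, 0L, 3L),
  Wine    = c(1L, 1L, 0L, 0L, 0L, 1L, 1L, 2L),
  Spirits = c(0L, 1L, 0L, 0L, 0L, 1L, 2L, 1L),
  SUM     = c(2L, 3L, 1L, 3L, 4L, 2L, 3L, 6L)
)

# Illustrative name: the columns whose names should appear in the label
drinks <- c("Beer", "Wine", "Spirits")

# For each row, join the names of the drink columns whose value is > 0
present <- apply(d[drinks] > 0, 1,
                 function(x) paste(drinks[x], collapse = "&"))

# Prefix the total, writing totals above 5 as "5+"
d$COMBINE <- paste0(ifelse(d$SUM > 5, "5+", d$SUM), present)
d
```

Because the names are joined with `collapse = "&"`, no trailing separator is produced and the `gsub` step is not needed.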
To count the factors you can use: table(d$COMBINE)
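Note that table(d$COMBINE) only lists combinations that actually occur, whereas the question's desired output also shows zero counts (for example 2Beer 0). One possible way to get those, sketched below, is to turn COMBINE into a factor with every candidate level before tabulating. The names `drinks`, `subsets`, `totals` and `lev` are illustrative, and the set of levels is an assumption: it includes both "5" and "5+" because the ifelse above can produce either.

```r
# Sample data and COMBINE column, rebuilt so the snippet is self-contained
d <- data.frame(
  Beer    = c(1L, 1L, 1L, 3L, 4L, 0L, 0L, 3L),
  Wine    = c(1L, 1L, 0L, 0L, 0L, 1L, 1L, 2L),
  Spirits = c(0L, 1L, 0L, 0L, 0L, 1L, 2L, 1L),
  SUM     = c(2L, 3L, 1L, 3L, 4L, 2L, 3L, 6L)
)
d$COMBINE <- with(d, gsub("&$", "",
  paste0(ifelse(SUM > 5, "5+", SUM),
         ifelse(Beer > 0, "Beer&", ""),
         ifelse(Wine > 0, "Wine&", ""),
         ifelse(Spirits > 0, "Spirits", ""))))

drinks <- c("Beer", "Wine", "Spirits")

# Every non-empty subset of the drinks, joined with "&" in column order
subsets <- unlist(lapply(seq_along(drinks), function(k)
  combn(drinks, k, FUN = paste, collapse = "&")))

# Assumed set of total prefixes; c() coerces the numbers to character
totals <- c(1:5, "5+")

# All prefix/subset pairs as candidate factor levels
lev <- as.vector(outer(totals, subsets, paste0))

# Tabulating the factor keeps unobserved levels, with a count of 0
counts <- table(factor(d$COMBINE, levels = lev))
counts
```

Some of the generated levels can never occur (a total of 1 with two drinks, say); they simply stay at zero and can be filtered out if they get in the way.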
Answer 1 (score: 0)
You can also do this with aggregate:
d$sum5 = pmin(5, d$Beer + d$Wine + d$Spirits)
d$count = 1
aggregate(count ~ (Beer>0) + (Wine>0) + (Spirits>0) + sum5, data=d, FUN=sum)
  Beer > 0 Wine > 0 Spirits > 0 sum5 count
1     TRUE    FALSE       FALSE    1     1
2     TRUE     TRUE       FALSE    2     1
3    FALSE     TRUE        TRUE    2     1
4     TRUE    FALSE       FALSE    3     1
5    FALSE     TRUE        TRUE    3     1
6     TRUE     TRUE        TRUE    3     1
7     TRUE    FALSE       FALSE    4     1
8     TRUE     TRUE        TRUE    5     1
Ignoring the sum (to show an example where not every count is 1):
aggregate(count ~ (Beer>0) + (Wine>0) + (Spirits>0), data=d, FUN=sum)
  Beer > 0 Wine > 0 Spirits > 0 count
1     TRUE    FALSE       FALSE     3
2     TRUE     TRUE       FALSE     1
3    FALSE     TRUE        TRUE     2
4     TRUE     TRUE        TRUE     2
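The aggregate output keeps the TRUE/FALSE flags in separate columns. If the COMBINE-style label from the question is wanted, one possible follow-up is to paste the flags back into a string, as in the sketch below. The helper columns `hasBeer`, `hasWine` and `hasSpirits` are hypothetical names introduced only to give the grouping columns predictable names. The sketch also assumes that totals of 5 or more should all be labelled 5+, which is what pmin(5, ...) implies and differs slightly from the `SUM > 5` test in the other answer.

```r
# Sample data from the question
d <- data.frame(
  Beer    = c(1L, 1L, 1L, 3L, 4L, 0L, 0L, 3L),
  Wine    = c(1L, 1L, 0L, 0L, 0L, 1L, 1L, 2L),
  Spirits = c(0L, 1L, 0L, 0L, 0L, 1L, 2L, 1L)
)
drinks <- c("Beer", "Wine", "Spirits")

d$sum5  <- pmin(5, d$Beer + d$Wine + d$Spirits)  # totals capped at 5
d$count <- 1

# Hypothetical helper columns so the aggregate result has simple names
d$hasBeer    <- d$Beer > 0
d$hasWine    <- d$Wine > 0
d$hasSpirits <- d$Spirits > 0

agg <- aggregate(count ~ hasBeer + hasWine + hasSpirits + sum5,
                 data = d, FUN = sum)

# Rebuild the "&"-joined drink names from the flags of each group
flags   <- as.matrix(agg[c("hasBeer", "hasWine", "hasSpirits")])
present <- apply(flags, 1,
                 function(x) paste(drinks[as.logical(x)], collapse = "&"))

# Assumption: a capped total of 5 is written as "5+"
agg$COMBINE <- paste0(ifelse(agg$sum5 >= 5, "5+", agg$sum5), present)
agg[c("COMBINE", "count")]
```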