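I have a data frame showing election results by constituency and party. I need to find the party with the most votes in each constituency.
My df looks like this: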
# gss party votes
1 W07000049 Labour 22662
2 W07000049 Conservative 5901
3 W07000049 LibDem 941
3 W07000058 Labour 5951
3 W07000058 LibDem 1741
3 W07000058 Conservative 852
I want to cast it so that the unique party names become my column names, like this:
# gss Labour Conservative LibDem
1 W07000049 22662 5901 941
2 W07000058 5951 852 1741
On that data frame I could use which.max, like so:
# Skip the first column (gss) so only the vote counts are compared
df$win <- colnames(df)[-1][apply(df[, -1], 1, which.max)]
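I tried using dcast from reshape2 (http://seananderson.ca/2013/10/19/reshape.html), but couldn't get it to work. How can I find the winning party for each constituency?
P.S. I'm a beginner, so please let me know if I can explain this better.
Answer 0 (score: 1)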
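Here is a reshape2::dcast solution:
library(reshape2)
# Rows are constituencies (V2), columns are parties (V3), cells are votes (V4)
dcast(df, V2 ~ V3, value.var = "V4")
# Output
# V2 Conservative Labour LibDem
# 1 W07000049 5901 22662 941
# 2 W07000058 852 5951 1741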
This assumes df has the following structure:
str(df)
#'data.frame': 6 obs. of 4 variables:
# $ V1: int 1 2 3 3 3 3
# $ V2: Factor w/ 2 levels "W07000049","W07000058": 1 1 1 2 2 2
# $ V3: Factor w/ 3 levels "Conservative",..: 2 1 3 2 3 1
# $ V4: int 22662 5901 941 5951 1741 852
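The cast table alone does not yet name a winner. As a minimal sketch of the remaining step, the snippet below starts from the cast output shown above, typed in by hand so that it runs on its own, and applies which.max to the party columns only. The names wide, party_cols and win are illustrative, and calling the first column gss is an assumption.

```r
# The cast table from the output above, entered by hand so the sketch is
# self-contained; the first column name (gss) is an assumption.
wide <- data.frame(
  gss          = c("W07000049", "W07000058"),
  Conservative = c(5901L, 852L),
  Labour       = c(22662L, 5951L),
  LibDem       = c(941L, 1741L)
)

# Only the party columns hold vote counts, so leave gss out of the comparison.
party_cols <- setdiff(names(wide), "gss")

# which.max gives the position of the largest count in each row;
# indexing party_cols with it yields the winning party's name.
wide$win <- party_cols[apply(wide[party_cols], 1, which.max)]
print(wide)
```

Note that which.max returns the first maximum, so a tied constituency would silently report whichever tied party comes first in column order.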
Answer 1 (score: 0)
Another reshape2::dcast solution.
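library(reshape2)
# Keep gss and party as identifiers; votes becomes the "value" column
molten <- melt(df, id.vars = c("gss", "party"))
# One row per constituency, one column per party
dcast(molten, gss ~ party, value.var = "value")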
# gss Conservative Labour LibDem
#1 W07000049 5901 22662 941
#2 W07000058 852 5951 1741
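Note that you can skip creating the intermediate data frame molten and simply run the one-liner dcast(melt(...), ...). Since df is already in long format, the melt step is not strictly required either: dcast(df, gss ~ party, value.var = "votes") gives the same table.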
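If the winner is all that is needed, reshaping can be avoided altogether. The following is a rough base R sketch, not part of the reshape2 approach above: it sorts the long data by constituency and descending votes, then keeps the first row per constituency. The names ordered and winners are illustrative, and df is rebuilt here only so the snippet runs on its own.

```r
# The same six rows as in the question, rebuilt so the sketch is self-contained.
df <- data.frame(
  gss   = c("W07000049", "W07000049", "W07000049",
            "W07000058", "W07000058", "W07000058"),
  party = c("Labour", "Conservative", "LibDem",
            "Labour", "LibDem", "Conservative"),
  votes = c(22662L, 5901L, 941L, 5951L, 1741L, 852L)
)

# Sort so that, within each constituency, the largest vote count comes first.
ordered <- df[order(df$gss, -df$votes), ]

# duplicated() flags every row after the first per gss; dropping those
# leaves exactly one row (the top-voted party) per constituency.
winners <- ordered[!duplicated(ordered$gss), c("gss", "party", "votes")]
print(winners)
```

As with which.max, ties are not reported: only the first row of each constituency after sorting is kept.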
Data.
df <-
structure(list(gss = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("W07000049",
"W07000058"), class = "factor"), party = structure(c(2L, 1L,
3L, 2L, 3L, 1L), .Label = c("Conservative", "Labour", "LibDem"
), class = "factor"), votes = c(22662L, 5901L, 941L, 5951L, 1741L,
852L)), .Names = c("gss", "party", "votes"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))