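Embarrassingly basic question, but if you don't know... I need to reshape a data frame of count-summarised data so that it looks the way it did before it was summarised. This is essentially the reverse of {plyr} count(), e.g.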
> (d = data.frame(value=c(1,1,1,2,3,3), cat=c('A','A','A','A','B','B')))
value cat
1 1 A
2 1 A
3 1 A
4 2 A
5 3 B
6 3 B
> (summry = plyr::count(d))
value cat freq
1 1 A 3
2 2 A 1
3 3 B 2
If you start from summry, what is the quickest way back to d? Unless I'm mistaken (very possible), {reshape2} doesn't do this.
Answer 0 (score: 2)
Just use rep:
summry[rep(rownames(summry), summry$freq), c("value", "cat")]
# value cat
# 1 1 A
# 1.1 1 A
# 1.2 1 A
# 2 2 A
# 3 3 B
# 3.1 3 B
A variant of this approach is available as expandRows in my "SOfun" package. If you have it loaded, you can simply do:
expandRows(summry, "freq")
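If you would rather not depend on a package, the same idea can be sketched in base R by indexing on row positions instead of row names and then resetting the row names. This is only a minimal sketch: the summary table is rebuilt by hand here so the snippet runs on its own, and the names idx and expanded are made up for illustration.

```r
# Rebuild the summary table from the question by hand (plyr is not needed here)
summry <- data.frame(value = c(1, 2, 3),
                     cat   = c("A", "A", "B"),
                     freq  = c(3, 1, 2))

# Repeat each row position freq times, keeping only the original columns
idx <- rep(seq_len(nrow(summry)), times = summry$freq)
expanded <- summry[idx, c("value", "cat")]

# Replace the 1, 1.1, 1.2, ... row names with plain 1..n
rownames(expanded) <- NULL
print(expanded)
```

Indexing by position avoids relying on the row names of summry being the default ones, which matters if the summary has been filtered or reordered beforehand.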
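Answer 1 (score: 1)
The R cookbook website has a nice function for turning a frequency table back into a data frame, which you can modify slightly. The only modifications are changing 'Freq' -> 'freq' (to be consistent with plyr::count) and making sure the rownames are reset to increasing integers.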
expand.dft <- function(x, na.strings = "NA", as.is = FALSE, dec = ".") {
  # Take each row in the source data frame table and replicate it
  # using the freq value
  DF <- sapply(1:nrow(x),
               function(i) x[rep(i, each = x$freq[i]), ],
               simplify = FALSE)

  # Take the above list and rbind it to create a single DF
  # Also subset the result to eliminate the freq column
  DF <- subset(do.call("rbind", DF), select = -freq)

  # Now apply type.convert to the character coerced factor columns
  # to facilitate data type selection for each column
  for (i in 1:ncol(DF)) {
    DF[[i]] <- type.convert(as.character(DF[[i]]),
                            na.strings = na.strings,
                            as.is = as.is, dec = dec)
  }

  # Reset the row names to 1..n
  row.names(DF) <- seq(nrow(DF))
  DF
}
expand.dft(summry)
value cat
1 1 A
2 1 A
3 1 A
4 2 A
5 3 B
6 3 B
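To convince yourself that an expansion really undoes the count, a round trip can be checked in base R. The sketch below is an assumption-laden illustration, not part of either answer: it uses aggregate() as a stand-in for plyr::count so that it runs without extra packages, and the names recount and expanded are hypothetical.

```r
# Original data from the question
d <- data.frame(value = c(1, 1, 1, 2, 3, 3),
                cat   = c("A", "A", "A", "A", "B", "B"))

# Count duplicate rows with base R (stand-in for plyr::count, an assumption)
recount <- aggregate(freq ~ value + cat,
                     data = transform(d, freq = 1L),
                     FUN = sum)

# Expand the counts back out and reset the row names
expanded <- recount[rep(seq_len(nrow(recount)), recount$freq), c("value", "cat")]
rownames(expanded) <- NULL

# Compare with the original; this relies on d already being sorted by group
print(all.equal(expanded, d))
```

The comparison only makes sense when the original rows were already grouped together, because the count throws away the original row order.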