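I am summarising a clickstream log file whose raw input rows have the form UID = character, Win/Lose = Boolean. The output summary I want has rows of the form UID, sumWin, sumLose. I have used table to get part of what I want, but I can't find the right syntax to extract the factor labels from the table result for use in the summary df. The example below builds a tiny test case and shows the problem I'm having: I can't get the factor labels out of the table result. (Of course, if you think there is a better way to approach the whole thing, that would obviously be very useful too!)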
I still can't get formatting to work in the editor; obviously that's my next question...!
foo <- data.frame(Uid=character(4), Win=logical(4), stringsAsFactors=FALSE)
foo$Uid <- c("UidA", "UidB", "UidA", "UidC")
foo$Win <- c(FALSE, TRUE, TRUE, FALSE)
#display foo
foo
Uid Win
1 UidA FALSE
2 UidB TRUE
3 UidA TRUE
4 UidC FALSE
# my desired summary df is, for each UID: NWin (foo$Win=TRUE), NRunUp (foo$Win=FALSE)
# here I initialise a holder for it
fooNUniques <- length(unique(foo$Uid))
fooSummary <- data.frame(Uids=character(fooNUniques),NWins=numeric(fooNUniques),NRunUps=numeric(fooNUniques))
fooSummary
Uids NWins NRunUps
1 0 0
2 0 0
3 0 0
#I can reference in to the result of applying table to get part of what I want
#First I get the table, this gets me a table by win/lose value
fooTable <- table(foo$Uid, foo$Win)
fooTable
FALSE TRUE
UidA 1 1
UidB 0 1
UidC 1 0
# I can get at the actual results via unname which gives me a matrix
fooTableAsMat <- unname(fooTable)
fooTableAsMat
[,1] [,2]
[1,] 1 1
[2,] 0 1
[3,] 1 0
#but the UID vec is hidden in the table structure *somewhere* and
# I can't work out how to reference it out
#coercing the result to a dataFrame doesn't work
as.data.frame(fooTable)
Var1 Var2 Freq
1 UidA FALSE 1
2 UidB FALSE 0
3 UidC FALSE 1
4 UidA TRUE 1
5 UidB TRUE 1
6 UidC TRUE 0
#I have also tried 'aggregate' but have not made friends with it
Answer 0 (score: 1)
Does this help? Using plyr:
> library(plyr)
> ddply(foo, .(Uid), summarise, NWin = sum(Win), NRunUp = sum(!Win))
# Uid NWin NRunUp
# 1 UidA 1 1
# 2 UidB 1 0
# 3 UidC 0 1
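As for the literal question of where the Uid labels are hiding: they are the row names of the table. Below is a minimal base-R sketch, assuming the same `foo` as in the question; the column names `Uids`, `NWins` and `NRunUps` are taken from the question's `fooSummary` holder, and the variable `uids` is just an illustrative name.

```r
# Rebuild the example data from the question
foo <- data.frame(Uid = c("UidA", "UidB", "UidA", "UidC"),
                  Win = c(FALSE, TRUE, TRUE, FALSE),
                  stringsAsFactors = FALSE)

# Cross-tabulate Uid against Win, as in the question
fooTable <- table(foo$Uid, foo$Win)

# The Uid labels are the table's row names
# (equivalently dimnames(fooTable)[[1]])
uids <- rownames(fooTable)

# Pick the count columns by name ("TRUE"/"FALSE") rather than by
# position; as.vector() drops the names so they do not leak into
# the data frame's row names
fooSummary <- data.frame(Uids    = uids,
                         NWins   = as.vector(fooTable[, "TRUE"]),
                         NRunUps = as.vector(fooTable[, "FALSE"]),
                         stringsAsFactors = FALSE)
fooSummary
#   Uids NWins NRunUps
# 1 UidA     1       1
# 2 UidB     1       0
# 3 UidC     0       1

# Alternative: keep the wide layout directly, with Uids as row names
# and columns named "FALSE" and "TRUE"
fooWide <- as.data.frame.matrix(fooTable)
```

One caveat with indexing by name: if the log happens to contain only wins or only losses, `table` will produce just one column, so `fooTable[, "TRUE"]` or `fooTable[, "FALSE"]` would fail. Tabulating `factor(foo$Win, levels = c(FALSE, TRUE))` instead of `foo$Win` should keep both columns present.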