I have data like this:
> head(df)
Date IsWin
20 2014-07-13 00:00:00 True
21 2014-08-01 00:00:00 True
22 2014-08-05 00:00:00 False
23 2014-06-28 00:00:00 True
24 2014-05-31 00:00:00 True
25 2014-06-06 00:00:00 True
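I want to group by date and sum over IsWin (which should count as a factor of 1 or -1).
I have already read this post, but it does not really deal with factors, so I do not know how to apply it: How to group a data.frame by date?
In the end, I want to pass the grouped and summed data to a bar chart that shows the number of wins or losses, as in: ggplot2 and a Stacked Bar Chart with Negative Values
The following prints a table, which is very helpful for seeing what I want; however, I would like to turn it into a bar chart for a better visual: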
> table(df[,1],df[,2])
False True
2014-05-25 00:00:00 1 0
2014-05-29 00:00:00 1 0
2014-05-30 00:00:00 2 0
2014-05-31 00:00:00 0 1
2014-06-06 00:00:00 0 1
2014-06-13 00:00:00 1 0
2014-06-14 00:00:00 0 1
2014-06-18 00:00:00 1 0
2014-06-19 00:00:00 0 1
2014-06-23 00:00:00 1 0
2014-06-24 00:00:00 1 0
2014-06-25 00:00:00 1 0
2014-06-27 00:00:00 0 1
2014-06-28 00:00:00 1 2
2014-07-02 00:00:00 1 0
2014-07-11 00:00:00 1 0
2014-07-13 00:00:00 0 2
2014-07-31 00:00:00 0 1
2014-08-01 00:00:00 0 1
2014-08-05 00:00:00 1 0
2014-08-07 00:00:00 1 0
2014-08-12 00:00:00 0 1
Here is my actual structure:
df <- structure(list(Date = c("2014-07-13 00:00:00", "2014-08-01 00:00:00",
"2014-08-05 00:00:00", "2014-06-28 00:00:00", "2014-05-31 00:00:00",
"2014-06-06 00:00:00", "2014-06-14 00:00:00", "2014-05-25 00:00:00",
"2014-06-24 00:00:00", "2014-06-28 00:00:00", "2014-05-30 00:00:00",
"2014-06-18 00:00:00", "2014-07-02 00:00:00", "2014-07-11 00:00:00",
"2014-05-29 00:00:00", "2014-06-19 00:00:00", "2014-07-31 00:00:00",
"2014-06-27 00:00:00", "2014-06-23 00:00:00", "2014-05-30 00:00:00",
"2014-07-13 00:00:00", "2014-08-12 00:00:00", "2014-06-13 00:00:00",
"2014-06-25 00:00:00", "2014-06-28 00:00:00", "2014-08-07 00:00:00"
), IsWin = structure(c(2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L
), .Label = c("False", "True"), class = "factor")), .Names = c("Date",
"IsWin"), row.names = 20:45, class = "data.frame")
Answer 0 (score: 1)
Try:
library(ggplot2)
# Count wins and losses per date; table() gives one row per Date/IsWin pair
ddf2 = data.frame(with(df, table(Date, IsWin)))
ggplot(ddf2) +
  geom_bar(aes(x=Date, y=Freq, fill=IsWin), stat='identity', position='dodge') +
  theme(axis.text.x=element_text(angle=45, size=10, hjust=1, vjust=1))
Edit: for negative bars:
# Sign multiplier: +1 for wins, -1 for losses
ddf2$new = ifelse(ddf2$IsWin=='True', 1, -1)
ggplot(ddf2) +
  geom_bar(data=ddf2[ddf2$new>0,], aes(x=Date, y=Freq*new, fill=IsWin), stat='identity') +
  geom_bar(data=ddf2[ddf2$new<0,], aes(x=Date, y=Freq*new, fill=IsWin), stat='identity') +
  theme(axis.text.x=element_text(angle=45, size=10, hjust=1, vjust=1))
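If what you need is a single net value per date (True counted as 1, False as -1, then summed), here is a minimal base-R sketch. The small `df` below is a hand-picked subset of the rows in the question, used only for illustration, and the names `Score` and `net` are my own, not something from the question.

```r
# Illustrative subset of the question's data (assumption: same column layout)
df <- data.frame(
  Date  = c("2014-06-28 00:00:00", "2014-06-28 00:00:00", "2014-06-28 00:00:00",
            "2014-07-13 00:00:00", "2014-07-13 00:00:00", "2014-08-05 00:00:00"),
  IsWin = factor(c("True", "False", "True", "True", "True", "False"),
                 levels = c("False", "True"))
)

# Map the factor to a signed score: +1 for a win, -1 for a loss
df$Score <- ifelse(df$IsWin == "True", 1, -1)

# Sum the signed scores within each date
net <- aggregate(Score ~ Date, data = df, FUN = sum)
print(net)
```

The resulting `net` data frame has one row per date, so it could be passed to ggplot with `stat='identity'` in the same way as `ddf2` above.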
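Answer 1 (score: 1)
How about this? You can use group_by() from the dplyr package. Group the data as shown below, then summarize (count) how many TRUE and FALSE values there are for each date. With the resulting data frame you can create a stacked bar chart.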
library(dplyr)
library(ggplot2)
### Create a sample data set
dates <- rep(c("2014-08-01", "2014-08-02"), each = 10, times = 1)
win <- rep(c("TRUE", "FALSE", "FALSE", "TRUE", "TRUE"), each = 1, times = 4)
foo <- data.frame(cbind(dates, win))
foo$dates <- as.character(foo$dates)
ana <- foo %>%
  group_by(dates, win) %>%
  summarize(count = n())
# ana
# Source: local data frame [4 x 3]
# Groups: dates
# dates win count
# 1 2014-08-01 FALSE 4
# 2 2014-08-01 TRUE 6
# 3 2014-08-02 FALSE 4
# 4 2014-08-02 TRUE 6
bob <- ggplot(ana, aes(x=dates, y=count, fill=win)) +
  geom_bar(stat="identity") +
  scale_y_continuous(breaks = seq(0,10,by = 1))
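As a side note, the group_by() plus summarize(n()) pair can usually be shortened with dplyr's count(). This is only a sketch on the same sample data; the name `ana2` is mine, and count() calls its result column `n` rather than `count`.

```r
library(dplyr)

# Same sample data as above
dates <- rep(c("2014-08-01", "2014-08-02"), each = 10)
win   <- rep(c("TRUE", "FALSE", "FALSE", "TRUE", "TRUE"), times = 4)
foo   <- data.frame(dates, win, stringsAsFactors = FALSE)

# count() groups by the given columns and tallies the rows in each group
ana2 <- count(foo, dates, win)
print(ana2)
```

If you go this way, use `y = n` in the ggplot call instead of `y = count`.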
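Updated option
After seeing the comments, I came up with this idea. It has two new things. One is converting the positive counts to negative ones when win is FALSE. The other is a new ggplot call. I am sure there is a better way, but I wanted to put this idea forward here.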
ana <- foo %>%
  group_by(dates, win) %>%
  summarize(count = n())
# If there is FALSE in ith row in the win column, make the value of ith row in the
# count column negative. If you can avoid a loop and achieve the same goal, that
# may be the best option. But, I do not have any ideas in my mind yet.
for (i in 1:nrow(ana)) {
  if (ana$win[[i]] == "FALSE") {
    ana$count[[i]] <- -abs(ana$count[[i]])
  }
}
bob <- ggplot(data = ana, aes(x = dates, y = count, fill = win)) +
  geom_bar(stat = "identity", position = position_dodge())
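The loop can probably be avoided with a vectorized ifelse() inside mutate(). The following is a sketch on the same sample data, not a tested drop-in for your real data; it assumes the win column holds the strings "TRUE" and "FALSE" as in the sample.

```r
library(dplyr)

# Same sample data as above
dates <- rep(c("2014-08-01", "2014-08-02"), each = 10)
win   <- rep(c("TRUE", "FALSE", "FALSE", "TRUE", "TRUE"), times = 4)
foo   <- data.frame(dates, win, stringsAsFactors = FALSE)

ana <- foo %>%
  group_by(dates, win) %>%
  summarize(count = n()) %>%
  ungroup() %>%
  # Flip the sign of the count for every FALSE row in one vectorized step
  mutate(count = ifelse(win == "FALSE", -count, count))
print(ana)
```

The ggplot call above should work unchanged on this version of `ana`, since the column names are the same.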
Does this do what you need?