I have:
$ Facebook : int 0 0 0 0 0 0 0 1 1 0 ...
$ YouTube : int 0 0 0 0 0 0 0 0 0 0 ...
...
$ Subscribed : int 0 0 0 1 0 0 0 0 1 0 ...
If I use sourceTotals = colSums(sources[c(2:13)]), I get the total for each service (as a named numeric vector).
How can I produce the following instead:
Service   Subscribed  NotSubscribed  Total
Facebook          50             50    100
YouTube           10            235    245
(I will then try to build a stacked bar chart from the new data frame.)
I have tried things like sourceTotals = data.frame(colSums(ifelse((sources[c(2:13)]+sources[15])>2, 1, 0)), colSums(ifelse((sources[c(2:13)]+sources[15])>2, 1, 0)), colSums(sources[c(2:13)]))
but that does not seem to be the right approach.
Thanks!
Solution based on Karolis's answer:
myData = sources[c(2:14)]
sL = rep("none", nrow(myData))
for (i in seq_len(ncol(myData) - 1)) {  # every column except the last (Subscribed)
sL[as.logical(myData[,i])] <- colnames(myData)[i]
}
sM = addmargins(table(sL, Subscribed=as.logical(myData$Subscribed)), 2)
library(reshape2)  # melt()
library(ggplot2)   # ggplot(), geom_bar()
sM1 = melt(sM[,c(1:2)])
colnames(sM1) = c("Source", "Subscribed", "Users")
ggplot(sM1, aes(x=Source, y=Users, fill=Subscribed)) + geom_bar(stat="identity") + theme(axis.text.x = element_text(angle = 90, hjust = 1))
Answer 0 (score: 1)
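This should be easy once the data is in a more convenient form. You can replace all the 0/1 group columns with a single vector of group labels: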
Groups <- rep("none", nrow(myData))
for (i in seq_len(ncol(myData) - 1)) {  # every column except the last (Subscribed)
Groups[as.logical(myData[,i])] <- colnames(myData)[i]
}
Groups
[1] "Google" "Google" "Google" "Google" "Chrome.Store" "Google"
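A note on loop bounds of this kind: in R the `:` operator binds more tightly than binary `-`, so an expression such as `1:n-1` is read as `(1:n) - 1` and starts at 0 rather than stopping at `n - 1`. A minimal sketch, where `n` is just a stand-in for `ncol(myData)`:

```r
n <- 4  # hypothetical column count, standing in for ncol(myData)

unparenthesised <- 1:n - 1      # parsed as (1:n) - 1
parenthesised   <- 1:(n - 1)    # the range that is usually intended
safe            <- seq_len(n - 1)

print(unparenthesised)  # 0 1 2 3
print(parenthesised)    # 1 2 3
print(safe)             # 1 2 3
```

`seq_len()` has the added benefit of returning an empty sequence when its argument is 0, whereas `1:0` counts down and yields `1 0`.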
After that it is as simple as:
addmargins(table(Groups, Subscribed=as.logical(myData$Subscribed)), 2)
              Subscribed
Groups         FALSE TRUE Sum
  Chrome.Store     1    0   1
  Google           4    1   5
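One caveat: the loop assigns each row a single label (the last column that contains a 1), so a user flagged for several services is counted only once. If that can happen in your data, the counts can also be taken per service column directly, which gives the Service / Subscribed / NotSubscribed / Total layout from the question. The following is only a sketch on made-up toy data; the values are invented, and it assumes every column other than `Subscribed` is a 0/1 service flag:

```r
# Toy data: values and the two service columns are invented for illustration.
sources <- data.frame(
  Facebook   = c(1, 0, 1, 1, 0),
  YouTube    = c(0, 1, 1, 0, 0),
  Subscribed = c(1, 0, 0, 1, 0)
)

# Assumption: every column except Subscribed is a 0/1 service flag.
services <- setdiff(names(sources), "Subscribed")

# Per service: users with the flag set, and how many of those subscribed.
total <- sapply(sources[services], function(flag) sum(flag == 1))
subscribed <- sapply(sources[services],
                     function(flag) sum(flag == 1 & sources$Subscribed == 1))

result <- data.frame(
  Service       = services,
  Subscribed    = subscribed,
  NotSubscribed = total - subscribed,
  Total         = total,
  row.names     = NULL
)
print(result)
```

With the real data, `services` would presumably be the names of columns 2 to 13 of `sources`. The resulting data frame can be melted on `Service` (dropping `Total`) to feed a stacked bar chart.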