Question

When faceting in ggplot I would often like to use percentages instead of counts.

For example:

library(ggplot2)
test1 <- sample(letters[1:2], 100, replace = TRUE)
test2 <- sample(letters[3:8], 100, replace = TRUE)
test <- data.frame(cbind(test1, test2))
ggplot(test, aes(test2)) + geom_bar() + facet_grid(~test1)

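This is very easy, but if N in facet A differs from N in facet B, I think it would be better to compare percentages, with each facet summing to 100%.

How would you achieve this?

Hope my question makes sense.

Sincerely.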

Answer 1

Here is a within-ggplot method that uses ..count.. and ..PANEL..:

Because this is computed on the fly, it should be robust to changes in plot parameters.
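
The code for this answer seems to have been lost from the text above. Below is a minimal sketch of what it describes, modeled on the tapply(..count.., ..group.., sum)[..group..] expression in Answer 6 with PANEL in place of group. It assumes ggplot2 3.3.0 or later, where after_stat(count) is the current spelling of ..count..; the set.seed() call and the name p_panel are my own additions, not the answerer's.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible
test <- data.frame(
    test1 = sample(letters[1:2], 100, replace = TRUE),
    test2 = sample(letters[3:8], 100, replace = TRUE)
)

# Divide each bar's count by the total count of the panel (facet) it sits in.
# tapply() gives one total per panel; indexing by PANEL spreads it back per bar.
p_panel <- ggplot(test, aes(test2)) +
    geom_bar(aes(y = after_stat(
        count / as.vector(tapply(count, PANEL, sum)[PANEL])
    ))) +
    facet_grid(~test1) +
    scale_y_continuous(labels = scales::percent_format())

# Check: the bar heights within each facet should add up to 1.
bars <- ggplot_build(p_panel)$data[[1]]
tapply(bars$y, bars$PANEL, sum)
```

Printing p_panel draws the plot. On older ggplot2 versions the same mapping would be written with ..count.. and ..PANEL.. instead of after_stat().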

Answer 2

Try this:

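library(plyr)
library(ggplot2)
# first make a data frame with frequencies
df <- as.data.frame(with(test, table(test1, test2)))
# or with count() from the plyr package, as Hadley suggested
# (count() names its column "freq", so rename it to match the next step):
# df <- rename(count(test, vars = c("test1", "test2")), c(freq = "Freq"))
# next: compute percentages per group
df <- ddply(df, .(test1), transform, p = Freq / sum(Freq))
# and plot; the heights are already computed, so use stat = "identity"
ggplot(df, aes(test2, p)) + geom_bar(stat = "identity") + facet_grid(~test1)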


You can also add + scale_y_continuous(formatter = "percent") to the plot for ggplot2 version 0.8.9, or + scale_y_continuous(labels = percent_format()) for version 0.9.0 (percent_format() comes from the scales package).
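
For what it's worth, the "percentages per group" step can also be sketched in base R with prop.table(), which may help if plyr is not installed. This is an alternative I am suggesting, not part of the answer above; the names counts, props and df2 are made up for the example, and geom_col() assumes ggplot2 2.2.0 or later.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible
test <- data.frame(
    test1 = sample(letters[1:2], 100, replace = TRUE),
    test2 = sample(letters[3:8], 100, replace = TRUE)
)

# Cross-tabulate, then divide each row (one row per test1 level) by its total.
counts <- table(test1 = test$test1, test2 = test$test2)
props <- prop.table(counts, margin = 1)
rowSums(props)  # each test1 level should sum to 1

# Back to a long data frame with columns test1, test2 and p.
df2 <- as.data.frame(props, responseName = "p")

# geom_col() plots the precomputed heights as they are.
p_prop <- ggplot(df2, aes(test2, p)) +
    geom_col() +
    facet_grid(~test1) +
    scale_y_continuous(labels = scales::percent_format())
```

Printing p_prop should give one facet per test1 level, each adding up to 100%.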

Answer 3

A very simple way to do it:

ggplot(test, aes(test2)) + 
    geom_bar(aes(y = (..count..)/sum(..count..))) + 
    facet_grid(~test1)

So I only changed the argument of geom_bar to aes(y = (..count..)/sum(..count..)). After setting ylab to NULL and specifying the formatter, you get:

ggplot(test, aes(test2)) +
    geom_bar(aes(y = (..count..)/sum(..count..))) + 
    facet_grid(~test1) +
    scale_y_continuous('', formatter="percent")

Update: note that while formatter = "percent" works for ggplot2 version 0.8.9, in 0.9.0 you need something like scale_y_continuous(labels = percent_format()).
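
One thing worth checking with this approach: as far as I can tell, in recent ggplot2 releases sum(..count..) runs over the bars of all facets together, so the heights add up to 100% across the whole plot and not within each facet. The sketch below is a quick way to see what your own version does. It assumes ggplot2 3.3.0 or later (after_stat() is the current spelling of the ..count.. notation); the names p_all and bars are mine.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible
test <- data.frame(
    test1 = sample(letters[1:2], 100, replace = TRUE),
    test2 = sample(letters[3:8], 100, replace = TRUE)
)

# Same mapping as in the answer, written with after_stat().
p_all <- ggplot(test, aes(test2)) +
    geom_bar(aes(y = after_stat(count / sum(count)))) +
    facet_grid(~test1)

# Inspect the computed bar heights instead of reading them off the plot.
bars <- ggplot_build(p_all)$data[[1]]
sum(bars$y)                      # total over the whole plot
tapply(bars$y, bars$PANEL, sum)  # total within each facet
```

If the per-facet totals are below 1, dividing by a per-panel total (the ..PANEL.. idea from Answer 1) is one way to get each facet to 100%.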

Answer 4

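Here's a solution that should get you moving in the right direction. I'm curious to see whether there are more efficient ways of doing this, as it seems a bit hacky and convoluted. We can use the built-in ..density.. argument for the y aesthetic, but factors don't work there. So, once we have converted test2 into a numeric object, we also need to use scale_x_discrete to label the axis appropriately.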

ggplot(data = test, aes(x = as.numeric(test2)))+ 
geom_bar(aes(y = ..density..), binwidth = .5)+ 
scale_x_discrete(limits = sort(unique(test$test2))) + 
facet_grid(~test1) + xlab("Test 2") + ylab("Density")

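But give this a whirl and let me know what you think.

Also, you can shorten the creation of your test data like so, which avoids the extra objects in your environment and having to cbind them together: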

test <- data.frame(
    test1 = sample(letters[1:2], 100, replace = TRUE), 
    test2 = sample(letters[3:8], 100, replace = TRUE)
)

Answer 5

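I deal with similar situations quite often, but I take a very different approach using two of Hadley's other packages, namely reshape and plyr. That is mainly because I prefer to look at things as 100% stacked bars (when they total 100%).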

library(reshape)  # cast()
library(plyr)     # ddply()
library(ggplot2)
test <- data.frame(sample(letters[1:2], 100, replace = TRUE), sample(letters[3:8], 100, replace = TRUE))
colnames(test) <- c("variable", "value")
test <- cast(test, variable + value ~ .)
colnames(test)[3] <- "frequ"

test <- ddply(test,"variable", function(x) {
    x <- x[order(x$value),]
    x$cfreq <- cumsum(x$frequ)/sum(x$frequ)
    x$pos <- (c(0,x$cfreq[-nrow(x)])+x$cfreq)/2
    x$freq <- (x$frequ)/sum(x$frequ)
    x
})

plot.tmp <- ggplot(test, aes(variable, frequ, fill = value)) +
    geom_bar(stat = "identity", position = "fill") +
    coord_flip() +
    scale_y_continuous("", formatter = "percent")
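
If only the 100% stacked bars are needed, and not the cfreq and pos helper columns, a shorter sketch is to let position = "fill" do the normalising directly on the raw data. This is a simplification I am suggesting, not the answerer's code; it assumes a recent ggplot2 with the scales package available, and the name p_fill is made up.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible
test <- data.frame(
    variable = sample(letters[1:2], 100, replace = TRUE),
    value    = sample(letters[3:8], 100, replace = TRUE)
)

# geom_bar() counts the rows; position = "fill" rescales every stack to
# a total height of 1, so each bar reads as 100%.
p_fill <- ggplot(test, aes(variable, fill = value)) +
    geom_bar(position = "fill") +
    coord_flip() +
    scale_y_continuous("", labels = scales::percent_format())

# Check: the top of the tallest stack should sit at 1.
stacks <- ggplot_build(p_fill)$data[[1]]
max(stacks$ymax)
```

Printing p_fill draws one horizontal bar per level of variable, split by value.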

Answer 6

Thanks for sharing the PANEL "tip" on the ggplot method.

For information: you can produce percentages on the y lab, on the same bar chart, using count and group in the ggplot method:

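library(scales)  # percent
# each + must end a line, not start one, or R stops parsing after the first line
ggplot(test, aes(test2, fill = test1)) +
    geom_bar(aes(y = (..count..)/tapply(..count.., ..group.., sum)[..group..]), position = "dodge") +
    scale_y_continuous(labels = percent)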

Percentage on y lab in a faceted ggplot barchart?
