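I want to make a stacked bar chart for one index level while the other level stays unstacked. With the code below, every index row gets its own bar, labelled with its index tuple: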
import matplotlib.pyplot as plt
from pandas import DataFrame, MultiIndex
from numpy import repeat
from numpy.random import randn
arrays = [repeat('a b'.split(), 2), [True, False, True, False]]
# list(...) so the tuples are materialised before building the index
midx = MultiIndex.from_tuples(list(zip(*arrays)), names=['letters', 'bool'])
df = DataFrame(randn(4, 2)**2 + 5, index=midx)
df.plot(kind='bar', stacked=True)
plt.legend(loc="center right", bbox_to_anchor=(1.5, 0.5), ncol=2)
But I would rather see the columns (0, 1) grouped side by side, as with this R code (in IPython):
%load_ext rpy2.ipython
dr = df.stack().reset_index()
and then
%%R -i dr
library(ggplot2)
names(dr) <- c('letters','bool','n','value')
x <- ggplot() +
  geom_bar(data=dr, aes(y = value, x = letters, fill = bool),
           stat="identity", position='stack') +
  theme_bw() +
  facet_grid( ~ n)
print(x)
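For reference, here is a small sketch of what the reshaping step `df.stack().reset_index()` produces, which is why the R code has to rename the columns. The frame is rebuilt with the same layout as above, and the rename at the end is just one possible way to do it on the pandas side (the name `dr_named` is made up for this illustration).

```python
import numpy as np
import pandas as pd

# Same layout as the frame above: index levels (letters, bool), columns 0 and 1.
midx = pd.MultiIndex.from_product([["a", "b"], [True, False]],
                                  names=["letters", "bool"])
df = pd.DataFrame(np.random.randn(4, 2) ** 2 + 5, index=midx)

# stack() moves the column labels into a third, unnamed index level;
# reset_index() then turns all three levels into ordinary columns.
dr = df.stack().reset_index()
print(dr.columns.tolist())  # ['letters', 'bool', 'level_2', 0]

# Optional: give the last two columns readable names before handing dr to R.
dr_named = dr.rename(columns={"level_2": "n", 0: "value"})
print(dr_named.head())
```

With the columns already named in pandas, the `names(dr) <- ...` line in the R code would no longer be needed.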
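Now: is there a way to do this in pandas, should I torture matplotlib instead, should I install ggplot for Python, or should I simply keep running ggplot2 from R (as I just did)? I could not get rpy2's ggplot class
from rpy2.robjects.lib import ggplot2
to work with my layout (yet).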
Answer 0 (score: 1)
If you already have R code, you can port it to rpy2 step by step:
import rpy2.robjects as ro
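# assumes pandas-to-R conversion is enabled (see rpy2.robjects.pandas2ri),
# so that dr arrives in R as a data.frame
ro.globalenv['dr'] = dr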
ro.r("""
library(ggplot2)
names(dr) <- c('letters','bool','n','value')
x <- ggplot() +
  geom_bar(data=dr, aes(y = value, x = letters, fill = bool),
           stat="identity", position='stack') +
  theme_bw() +
  facet_grid( ~ n)
print(x)
""")
The drawback is that this uses R's GlobalEnv. A function is more elegant:
make_plot = ro.r("""
function(dr) {
  names(dr) <- c('letters','bool','n','value')
  x <- ggplot() +
    geom_bar(data=dr, aes(y = value, x = letters, fill = bool),
             stat="identity", position='stack') +
    theme_bw() +
    facet_grid( ~ n)
  print(x)
}""")
make_plot(dr)
Another approach is to use the ggplot2 mapping in rpy2 and build the plot without writing any R code:
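import rpy2.robjects.pandas2ri  # makes the name rpy2 available for the conversion call below
from rpy2.robjects import Formula
from rpy2.robjects.lib.ggplot2 import ggplot, geom_bar, aes_string, theme_bw, facet_grid
## oddity with names in the examples, that can either be corrected in the Python-pandas
## structure or with an explicit conversion into an R object and renaming there
## (the converter has been renamed across rpy2 releases, e.g. py2ri and later py2rpy;
## adjust the call to your installed version)
drr = rpy2.robjects.pandas2ri.pandas2ri(dr)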
drr.names[2] = 'n'
drr.names[3] = 'value'
p = ggplot(drr) + \
    geom_bar(aes_string(x="letters", y="value", fill="bool"),
             stat="identity", position="stack") + \
    theme_bw() + \
    facet_grid(Formula('~ n'))
p.plot()
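As for the "can this be done in pandas" part of the question, here is a minimal sketch that stays within pandas and matplotlib. It is not from the answer above. It assumes the same frame layout as in the question, draws one panel per column (playing the role of `facet_grid(~ n)`), and stacks by the `bool` level inside each panel. The helper name `plot_stacked_facets` and its `stack_level` parameter are made up for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Same layout as the question's frame: index levels (letters, bool), columns 0 and 1.
midx = pd.MultiIndex.from_product([["a", "b"], [True, False]],
                                  names=["letters", "bool"])
df = pd.DataFrame(np.random.randn(4, 2) ** 2 + 5, index=midx)


def plot_stacked_facets(frame, stack_level="bool"):
    """Hypothetical helper: one panel per column, bars stacked by `stack_level`."""
    fig, axes = plt.subplots(1, frame.shape[1], sharey=True,
                             squeeze=False, figsize=(8, 4))
    axes = axes[0]  # single row of panels
    for ax, col in zip(axes, frame.columns):
        # Rows become the remaining index level (letters), columns the
        # stacked level (bool), so each letter gets one stacked bar.
        wide = frame[col].unstack(stack_level)
        wide.plot(kind="bar", stacked=True, ax=ax, legend=False,
                  title="n = {}".format(col))
    # One shared legend for the whole figure instead of one per panel.
    handles, labels = axes[-1].get_legend_handles_labels()
    fig.legend(handles, labels, title=stack_level, loc="center right")
    return fig, axes


fig, axes = plot_stacked_facets(df)
```

In a notebook the figure is displayed directly; in a script, follow up with `plt.show()` or `fig.savefig(...)`. The styling will not match `theme_bw()`, so this is only a rough equivalent of the ggplot2 output.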