I have a pandas DataFrame like this:
   LEVEL_1  LEVEL_2   Freq  Percentage
0     HIGH     HIGH   8842      17.684
1  AVERAGE      LOW   2802       5.604
2      LOW      LOW  22198      44.396
3  AVERAGE  AVERAGE   6804      13.608
4      LOW  AVERAGE   2030       4.060
5     HIGH  AVERAGE   3666       7.332
6  AVERAGE     HIGH   2887       5.774
7      LOW     HIGH    771       1.542
I can get a mosaic plot of LEVEL_1 and LEVEL_2:
from statsmodels.graphics.mosaicplot import mosaic
mosaic(df, ['LEVEL_1','LEVEL_2'])
I just want to put Freq and Percentage in the center of each mosaic tile.
How can I do that?
Answer 0 (score: 2)
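Here's a start. Note that I had to add a row with a zero count to the DataFrame for the
one combination (HIGH/LOW) that is missing from the data. You can make the labels nicer
with string formatting in the lambda function. You will probably also want to reorder the categories.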
import pandas as pd
from statsmodels.graphics.mosaicplot import mosaic
import io
d = io.StringIO()
d.write(""" LEVEL_1 LEVEL_2 Freq Percentage\n
HIGH HIGH 8842 17.684\n
AVERAGE LOW 2802 5.604\n
LOW LOW 22198 44.396\n
AVERAGE AVERAGE 6804 13.608\n
LOW AVERAGE 2030 4.060\n
HIGH AVERAGE 3666 7.332\n
AVERAGE HIGH 2887 5.774\n
LOW HIGH 771 1.542""")
d.seek(0)
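# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(d, skipinitialspace=True, sep=r'\s+')
# DataFrame.append was removed in pandas 2.0, so add the zero-count row with pd.concat
zero_row = pd.DataFrame([{'LEVEL_1': 'HIGH', 'LEVEL_2': 'LOW', 'Freq': 0, 'Percentage': 0}])
df = pd.concat([df, zero_row], ignore_index=True)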
df = df.sort_values(['LEVEL_1', 'LEVEL_2'])
df = df.set_index(['LEVEL_1', 'LEVEL_2'])
print(df)
mosaic(df['Freq'], labelizer=lambda k: df.loc[k].values);
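To follow up on the two suggestions above (nicer labels and reordered categories), here is a rough sketch rather than a definitive solution. The helper names `build_labels`, `build_counts` and `plot_labeled_mosaic`, the `ORDER` list and the one-decimal percentage format are my own choices, not part of statsmodels. It also assumes that `mosaic` keeps categories in the order in which they first appear in a plain dict, which matches my understanding of the current implementation but is worth checking against your installed version.

```python
import pandas as pd

# Data from the question; the HIGH/LOW combination has no row.
rows = [
    ("HIGH", "HIGH", 8842, 17.684),
    ("AVERAGE", "LOW", 2802, 5.604),
    ("LOW", "LOW", 22198, 44.396),
    ("AVERAGE", "AVERAGE", 6804, 13.608),
    ("LOW", "AVERAGE", 2030, 4.060),
    ("HIGH", "AVERAGE", 3666, 7.332),
    ("AVERAGE", "HIGH", 2887, 5.774),
    ("LOW", "HIGH", 771, 1.542),
]
df = pd.DataFrame(rows, columns=["LEVEL_1", "LEVEL_2", "Freq", "Percentage"])

# Assumed display order for both levels (hypothetical, change as needed).
ORDER = ["LOW", "AVERAGE", "HIGH"]


def build_labels(frame):
    """Map each (LEVEL_1, LEVEL_2) pair to a two-line 'Freq / Percentage' label."""
    return {
        (r.LEVEL_1, r.LEVEL_2): f"{r.Freq}\n{r.Percentage:.1f}%"
        for r in frame.itertuples(index=False)
    }


def build_counts(frame, order=ORDER):
    """Return counts for every combination, in the requested order, 0 if missing."""
    freq = {(r.LEVEL_1, r.LEVEL_2): r.Freq for r in frame.itertuples(index=False)}
    return {(a, b): freq.get((a, b), 0) for a in order for b in order}


def plot_labeled_mosaic(frame):
    """Draw the mosaic with formatted labels; returns what mosaic() returns."""
    # Imported here so the helpers above work without the plotting stack.
    from statsmodels.graphics.mosaicplot import mosaic

    labels = build_labels(frame)
    counts = build_counts(frame)
    # Tiles that have no row in the frame (here HIGH/LOW) get an empty label.
    return mosaic(counts, labelizer=lambda key: labels.get(key, ""))
```

Calling `plot_labeled_mosaic(df)` and then `matplotlib.pyplot.show()` should draw the plot. Looking labels up with `dict.get` means the labelizer does not fail on the missing HIGH/LOW tile, so no extra zero row has to be added to the DataFrame itself.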