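I have a DataFrame with 'quarter' and 'resale_price' columns, and I use it to draw a box plot with seaborn. The x-axis shows the quarter values (2007-Q2, 2007-Q3, 2007-Q4, 2008-Q2, ...), but I would like it to show year values instead (2007, 2008, 2009). How can I do that?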
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
data = {
'quarter': ['2007-Q2', '2007-Q2', '2007-Q2', '2007-Q2', '2007-Q3',
'2007-Q3', '2007-Q3', '2007-Q3', '2007-Q4', '2007-Q4',
'2007-Q4', '2007-Q4', '2008-Q2', '2008-Q2', '2008-Q2',
'2008-Q2','2008-Q3', '2008-Q3', '2008-Q3', '2008-Q3',
'2008-Q4', '2008-Q4', '2008-Q4', '2008-Q4', '2009-Q2',
'2009-Q2', '2009-Q2', '2009-Q2', '2009-Q3', '2009-Q3',
'2009-Q3', '2009-Q3', '2009-Q4', '2009-Q4', '2009-Q4',
'2009-Q4', '2010-Q2','2010-Q2', '2010-Q2', '2010-Q2',
'2010-Q3', '2010-Q3', '2010-Q3', '2010-Q3', '2010-Q4',
'2010-Q4', '2010-Q4', '2010-Q4'],
'resale_price': [172000, 260000, 372000, 172000, 224500, 224500,
311500, 358800, 438000, 344000, 182200, 261300, 372000,
172000, 224500, 224240, 311500, 358800, 438000, 344900,
172000, 260000, 372000, 172000, 224500, 224500, 311500,
358800, 438000, 394000, 172400, 360000, 172000, 472000,
254500, 226510, 321600, 358800, 438800, 394000, 155400,
465000, 232000, 475090, 244520, 236518, 321100, 398901]
}
df = pd.DataFrame(data)
plt.figure(figsize=(12,6))
ax = sns.boxplot(data = df, x='quarter', y='resale_price')
for item in ax.get_xticklabels():
    item.set_rotation(90)
Answer 0 (score: 1)
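Use set_xticklabels with the first 4 characters of each quarter string, and set the rotation in the same call. There must be one label per box, so take the distinct quarters rather than the whole column:
labels = df['quarter'].drop_duplicates().str[:4]  # one label per box, in plotting order
ax.set_xticklabels(labels, rotation='vertical')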
The loop is then redundant and should be removed:
for item in ax.get_xticklabels():
    item.set_rotation(90)
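The labels can be checked before anything is drawn. The following is a minimal sketch, assuming seaborn draws one box per distinct quarter in order of first appearance (which, as far as I know, is what it does for string columns). The small frame is a shortened subset of the question's data, and `tick_labels` is just an illustrative name:

```python
import pandas as pd

# Shortened subset of the question's data, for illustration only.
df = pd.DataFrame({
    "quarter": ["2007-Q2", "2007-Q2", "2007-Q3", "2007-Q4", "2008-Q2", "2008-Q2"],
    "resale_price": [172000, 260000, 224500, 438000, 372000, 172000],
})

# One label per distinct quarter (i.e. per box), in order of first appearance.
tick_labels = df["quarter"].drop_duplicates().str[:4].tolist()
print(tick_labels)  # ['2007', '2007', '2007', '2008']
```

The list has four entries for four boxes, whereas the full column has six rows. Passing the full column would supply one label per row rather than one per box.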
If you need a separate box per year instead:
df['year'] = df['quarter'].str[:4]
ax = sns.boxplot(data = df, x='year', y='resale_price')
Answer 1 (score: 1)
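As I understand the question, you want only 4 boxes, one per year. You can get that by extracting the year from the 'quarter' column and passing the new 'year' column to the seaborn function.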
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
# data = ...  (as defined in the question)
df = pd.DataFrame(data)
# Keep the part before the first '-' ('2007-Q2' -> '2007').
df["year"] = df["quarter"].str.split("-", n=1).str[0]
plt.figure(figsize=(12,6))
ax = sns.boxplot(data = df, x='year', y='resale_price')
plt.show()
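A variation on the same idea, offered as a sketch rather than the definitive approach: extract the year with a regular expression and store it as an integer, then summarise per year to sanity-check what each box will show. The small frame is a shortened subset of the question's data, and `summary` is an illustrative name:

```python
import pandas as pd

# Shortened subset of the question's data, for illustration only.
df = pd.DataFrame({
    "quarter": ["2007-Q2", "2007-Q3", "2007-Q4", "2008-Q2", "2008-Q3", "2008-Q4"],
    "resale_price": [172000, 224500, 438000, 372000, 311500, 172000],
})

# Capture the leading four digits and convert them to integers.
df["year"] = df["quarter"].str.extract(r"^(\d{4})", expand=False).astype(int)

# Median price per year: the centre line of each per-year box.
summary = df.groupby("year")["resale_price"].median()
print(summary)
```

The resulting `year` column can be passed as `x='year'` to `sns.boxplot` exactly as above.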