Question

我的目标是使用我的.csv数据创建一个条形图，以查看按性别分组的工作年（x）和工资（y）之间的关系（单独的条形图）。

首先，我想将变量'workyear'分为三组：（1）超过10年，（2）仅10年和（3）不到10年然后我想用性别（1 =女性，0 =男性）

创建条形图

我的部分数据如下：

...    workyear gender wage 
513         12    0  15.00
514         16    0  12.67
515         14    1   7.38
516         16    0  15.56
517         12    1   7.45
518         14    1   6.25
519         16    1   6.25
520         17    0   9.37
....

为此，我尝试将变量的值替换为三组，并使用了matplotlib。

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

#load data 
df=pd.DataFrame.from_csv('data.csv', index_col=None)
print(df)
df.sort_Values("workyear", ascending=True, inplace=True)

#parameters
bar_width = 0.2

#replacing Education year -> Education level grouped by given criteria.
#But I got an error.
df.loc[df.workyear<10, 'workyear'] = 'G1'
df.loc[df.workyear==10, 'workyear'] = 'G2'
df.loc[df.workyear>10, 'workyear']='G3'

#plotting
plt.bar(x, df.education[df.gender==1], bar_width, yerr=df.wage,color='y', label='female')
plt.bar(x+bar_width, df.education[df.gender==0], bar_width, yerr=df.wage, color='c', label='male')

我希望看到这样的条形图（请将'+'视为条形码）：

y=wage|                 + +
      | +        +      + +
      | +      + +      + +
      | + +    + +      + +
      |_______________________ x=work year (3-group)
        >10     10       10<

但这就是我实际得到的......（是的。所有错误）

Traceback (most recent call last):
File "data.py", line 21, in <module>
df.loc[df.workyear>10, 'workyear']='G3'
in wrapper
res = na_op(values, other)
in na_op
result = _comp_method_OBJECT_ARRAY(op, x, y)
in _comp_method_OBJECT_ARRAY
result = lib.scalar_compare(x, y, op)
File "pandas\_libs\lib.pyx", line 769, in pandas._libs.lib.scalar_compare (pandas\_libs\lib.c:13717) 
TypeError: unorderable types: str() > int()

你能告诉我一下吗？

Answer 1

更直接的方式：

df['Age']=pd.cut(df.workyear,[1,13,14,100])
df['Gender']=df.gender.map({0:'male',1:'female'})
df.pivot_table(values='wage',index='Age',columns='Gender').plot.bar()

for：

Python：如何使用matplotlib在python中绘制条形图？

1 个答案: