在pandas数据帧中的两个字段之间绘制线图

时间:2017-12-15 22:00:20

标签: python pandas matplotlib

我正在尝试使用pandas数据框绘制图形。代码在

之下
import pandas as pd
import numpy as np
from IPython.display import display

movies = pd.read_csv('data/movie.csv')
director = movies['director_name']

director.to_frame().head()

director_name
0   James Cameron
1   Gore Verbinski
2   Sam Mendes
3   Christopher Nolan
4   Doug Walker

director.value_counts()

Steven Spielberg    26
Woody Allen         22
Clint Eastwood      20
Martin Scorsese     20
                    ..
James Nunn           1
Gerard Johnstone     1
Ethan Maniquis       1
Antony Hoffman       1
Name: director_name, Length: 2397, dtype: int64

我想在director_namedirector value counts之间绘制一个折线图。

import matplotlib.pyplot as plt

%matplotlib inline

df_list = list(director)
# print(df_list)

x = df_list
y = list(director.value_counts())

plt.figure(figsize=(15,3))
plt.plot(x, y)
plt.ylim(0, 100)
plt.xlabel('X Axis')
plt.ylabel('Y axis')
plt.title('Line Plot')
plt.suptitle('Figure Title', size=20, y=1.03)

我收到以下错误。我做错了什么?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-70-186ebd6b22d9> in <module>()
      8 
      9 plt.figure(figsize=(15,3))
---> 10 plt.plot(x, y)
     11 #plt.xlim(0, 10)
     12 plt.ylim(0, 100)

~/anaconda/lib/python3.6/site-packages/matplotlib/pyplot.py in plot(*args, **kwargs)
   3238                       mplDeprecation)
   3239     try:
-> 3240         ret = ax.plot(*args, **kwargs)
   3241     finally:
   3242         ax._hold = washold

~/anaconda/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
   1708                     warnings.warn(msg % (label_namer, func.__name__),
   1709                                   RuntimeWarning, stacklevel=2)
-> 1710             return func(ax, *args, **kwargs)
   1711         pre_doc = inner.__doc__
   1712         if pre_doc is None:

~/anaconda/lib/python3.6/site-packages/matplotlib/axes/_axes.py in plot(self, *args, **kwargs)
   1435         kwargs = cbook.normalize_kwargs(kwargs, _alias_map)
   1436 
-> 1437         for line in self._get_lines(*args, **kwargs):
   1438             self.add_line(line)
   1439             lines.append(line)

~/anaconda/lib/python3.6/site-packages/matplotlib/axes/_base.py in _grab_next_args(self, *args, **kwargs)
    402                 this += args[0],
    403                 args = args[1:]
--> 404             for seg in self._plot_args(this, kwargs):
    405                 yield seg
    406 

~/anaconda/lib/python3.6/site-packages/matplotlib/axes/_base.py in _plot_args(self, tup, kwargs)
    382             x, y = index_of(tup[-1])
    383 
--> 384         x, y = self._xy_from_xy(x, y)
    385 
    386         if self.command == 'plot':

~/anaconda/lib/python3.6/site-packages/matplotlib/axes/_base.py in _xy_from_xy(self, x, y)
    241         if x.shape[0] != y.shape[0]:
    242             raise ValueError("x and y must have same first dimension, but "
--> 243                              "have shapes {} and {}".format(x.shape, y.shape))
    244         if x.ndim > 2 or y.ndim > 2:
    245             raise ValueError("x and y can be no greater than 2-D, but have "

ValueError: x and y must have same first dimension, but have shapes (4916,) and (2397,)

2 个答案:

答案 0 :(得分:1)

您的x = df_listy = list(director.value_counts())维度不同。您不需要x = df_list,因为y已包含您要查找的信息。

使用此:

labels = director.value_counts().index.values  // Use this for xtick labels
y = list(director.value_counts())
maxY = max(y);
x = range(len(y))

...
ax = plt.plot(x, y, '-', grid=True, color='blue')
ax.set_xticks(range(len(y)))
ax.set_xticklabels(labels)

答案 1 :(得分:0)

IIUC,您可以groupby()count()plot()。您不需要value_counts()

例如,样本数据框director为:

print(director)
        director_name
0       James Cameron
1      Gore Verbinski
2          Sam Mendes
3   Christopher Nolan
4         Doug Walker
5       James Cameron
6          Sam Mendes
7          Sam Mendes

使用:

director.groupby("director_name").director_name.count().plot()

enter image description here