I am trying to plot the following high-dimensional data with pandas in Python: http://i.stack.imgur.com/34nbR.jpg
Here is my code:
import pandas
from pandas.tools.plotting import parallel_coordinates
data = pandas.read_csv('ParaCoords.csv')
parallel_coordinates(data,'Name')
The code fails to plot the data, and the traceback ends with:
KeyError: 'Name'
What is the second argument of parallel_coordinates supposed to be? How can I plot the data successfully?
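Answer 0 (score: 1)
The second argument should be the name of the column that defines the class of each row. Think of a column holding values such as ['dog', 'dog', 'cat', 'bird', 'cat', 'dog'].
In the online example, they pass 'Name' as the second argument because that is the column holding the iris species names. If your CSV has no column called 'Name', pandas raises KeyError: 'Name'.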
Signature: parallel_coordinates(*args, **kwargs)
Docstring:
Parallel coordinates plotting.

Parameters
----------
frame: DataFrame
class_column: str
    Column name containing class names
cols: list, optional
    A list of column names to use
ax: matplotlib.axis, optional
    matplotlib axis object
color: list or tuple, optional
    Colors to use for the different classes
use_columns: bool, optional
    If true, columns will be used as xticks
xticks: list or tuple, optional
    A list of values to use for xticks
colormap: str or matplotlib colormap, default None
    Colormap to use for line colors.
axvlines: bool, optional
    If true, vertical lines will be added at each xtick
axvlines_kwds: keywords, optional
    Options to be passed to axvline method for vertical lines
kwds: keywords
    Options to pass to matplotlib plotting method
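As a rough illustration of the class_column parameter described above, here is a minimal sketch. The DataFrame, its column names ("height", "weight", "species") and its values are made up for the example and are not taken from the question's ParaCoords.csv. The import path pandas.plotting assumes pandas 0.20 or later.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd
from pandas.plotting import parallel_coordinates  # pandas.tools.plotting before pandas 0.20

# Hypothetical data: "species" plays the role of the class column.
data = pd.DataFrame({
    "height": [40.0, 45.0, 25.0, 10.0, 23.0, 50.0],
    "weight": [20.0, 25.0, 4.0, 0.1, 5.0, 30.0],
    "species": ["dog", "dog", "cat", "bird", "cat", "dog"],
})

class_column = "species"

# Passing a name that is not among data.columns is what produces the KeyError,
# so check first and report the columns that are actually available.
if class_column not in data.columns:
    raise KeyError(f"{class_column!r} not found; available columns: {list(data.columns)}")

# One line is drawn per row, colored by the value in the class column.
ax = parallel_coordinates(data, class_column)
```

With your own file, printing data.columns right after read_csv shows which names are valid choices for the second argument. In an interactive session, call matplotlib.pyplot.show() afterwards to display the figure.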
Answer 1 (score: 0)
The iris.data file that you download from UCI has no header row. For the pandas example to work, you have to assign the column names explicitly:
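import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates  # pandas.tools.plotting before pandas 0.20
# The iris.data file from UCI does not have headers,
# so we have to assign the column names explicitly.
# header=None keeps the first data row from being consumed as a header.
data = pd.read_csv("data-iris-for-pandas/iris.data", header=None)
data.columns = ["x1", "x2", "x3", "x4", "Name"]
plt.figure()
parallel_coordinates(data, "Name")
plt.show()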
Essentially, the pandas documentation is incomplete: someone put the column names into the data frame without telling us.
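To make the header issue in this answer concrete, the following sketch reads three header-less rows in the layout of the UCI iris.data file from an in-memory string instead of a file on disk. The variable names (raw, names, inferred, labelled) are chosen for the illustration only.

```python
import io

import pandas as pd

# Three rows in the layout of the UCI iris.data file: no header line.
raw = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
)
names = ["x1", "x2", "x3", "x4", "Name"]

# Default behaviour: the first data row is consumed as the header,
# so one row is lost and there is no column called "Name".
inferred = pd.read_csv(io.StringIO(raw))
print(len(inferred), "Name" in inferred.columns)

# header=None together with names= keeps every row and labels the columns.
labelled = pd.read_csv(io.StringIO(raw), header=None, names=names)
print(len(labelled), list(labelled.columns))
```

The first frame ends up with two rows and no 'Name' column, which is the situation that leads to KeyError: 'Name'. The second frame keeps all three rows and can be passed to parallel_coordinates with 'Name' as the class column.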