Color by column values in Matplotlib

Time: 2013-02-14 23:23:12

Tags: python pandas matplotlib

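One of my favorite aspects of using the ggplot2 library in R is the ability to easily specify aesthetics. I can quickly make a scatterplot and apply a color associated with a specific column, and I would love to be able to do this with python / pandas / matplotlib. Are there any convenience functions that people use to map colors to values using pandas dataframes and Matplotlib?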

##ggplot scatterplot example with R dataframe, `df`, colored by col3
ggplot(data = df, aes(x=col1, y=col2, color=col3)) + geom_point()

##ideal situation with pandas dataframe, 'df', where colors are chosen by col3
df.plot(x=col1,y=col2,color=col3)

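EDIT: Thank you for your responses, but I want to include a sample dataframe to clarify what I am asking. Two columns contain numerical data and the third is a categorical variable. The script I have in mind would assign colors based on this value.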

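import numpy as np
import pandas as pd
# size=10 draws one value per row, matching the 10 Gender labels
df = pd.DataFrame({'Height':np.random.normal(10, size=10),
                   'Weight':np.random.normal(10, size=10),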
                   'Gender': ["Male","Male","Male","Male","Male",
                              "Female","Female","Female","Female","Female"]})

4 answers:

Answer 0 (score: 44)

Update, October 2015

Seaborn handles this use case splendidly:

import numpy
import pandas
from matplotlib import pyplot
import seaborn
seaborn.set(style='ticks')

numpy.random.seed(0)
N = 37
_genders= ['Female', 'Male', 'Non-binary', 'No Response']
df = pandas.DataFrame({
    'Height (cm)': numpy.random.uniform(low=130, high=200, size=N),
    'Weight (kg)': numpy.random.uniform(low=30, high=100, size=N),
    'Gender': numpy.random.choice(_genders, size=N)
})

fg = seaborn.FacetGrid(data=df, hue='Gender', hue_order=_genders, aspect=1.61)
fg.map(pyplot.scatter, 'Weight (kg)', 'Height (cm)').add_legend()

This immediately outputs a scatter plot with the points colored by Gender and a legend (image omitted).

Old answer

In this case, I would use matplotlib directly.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

def dfScatter(df, xcol='Height', ycol='Weight', catcol='Gender'):
    fig, ax = plt.subplots()
    categories = np.unique(df[catcol])
    colors = np.linspace(0, 1, len(categories))
    colordict = dict(zip(categories, colors))  

    df["Color"] = df[catcol].apply(lambda x: colordict[x])
    ax.scatter(df[xcol], df[ycol], c=df.Color)
    return fig

if 1:
    df = pd.DataFrame({'Height':np.random.normal(size=10),
                       'Weight':np.random.normal(size=10),
                       'Gender': ["Male","Male","Unknown","Male","Male",
                                  "Female","Did not respond","Unknown","Female","Female"]})    
    fig = dfScatter(df)
    fig.savefig('fig1.png')

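That gives me:

(Image: scatter plot with categorized colors)

As far as I know, that color column can hold any matplotlib-compatible color (RGBA tuples, HTML names, hex values, etc.).

I'm having trouble getting anything other than numeric values to work with the colormaps.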
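One way around the numeric-only limitation is to skip the colormap and draw each category as its own scatter call with an explicit color. The following is only a sketch under stated assumptions, not part of the original answer: the helper name `scatter_by_category`, its `palette` argument, and the fallback to matplotlib's "C0", "C1", ... cycle colors are all illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def scatter_by_category(df, xcol, ycol, catcol, palette=None):
    """Scatter ycol against xcol with one color and one legend entry per category.

    Hypothetical helper: `palette` maps each category to any matplotlib color.
    """
    categories = sorted(df[catcol].unique())
    if palette is None:
        # Assumption: fall back to the default property-cycle colors "C0".."C9".
        palette = {cat: f"C{i % 10}" for i, cat in enumerate(categories)}
    fig, ax = plt.subplots()
    # groupby yields (category, sub-frame) pairs in sorted category order.
    for cat, group in df.groupby(catcol):
        ax.scatter(group[xcol], group[ycol], color=palette[cat], label=cat)
    ax.set_xlabel(xcol)
    ax.set_ylabel(ycol)
    ax.legend(title=catcol)
    return fig, ax, palette


rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Height": rng.normal(size=10),
    "Weight": rng.normal(size=10),
    "Gender": ["Male"] * 5 + ["Female"] * 5,
})
# Colors may be given as names or hex strings.
fig, ax, palette = scatter_by_category(
    df, "Height", "Weight", "Gender",
    palette={"Male": "#1f77b4", "Female": "orange"},
)
```

Because every category gets its own `scatter` call, the legend comes for free and the input dataframe is left unmodified, at the cost of one artist per category.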

Answer 1 (score: 7)

Actually, you could use ggplot for Python:

from ggplot import *
import numpy as np
import pandas as pd

df = pd.DataFrame({'Height':np.random.randn(10),
                   'Weight':np.random.randn(10),
                   'Gender': ["Male","Male","Male","Male","Male",
                              "Female","Female","Female","Female","Female"]})


ggplot(aes(x='Height', y='Weight', color='Gender'), data=df)  + geom_point()

(Image: ggplot in python)

Answer 2 (score: 4)

You can use the color parameter of the plot method to define the color you want for each column. For example:

from pandas import DataFrame
data = DataFrame({'a':range(5),'b':range(1,6),'c':range(2,7)})
colors = ['yellowgreen','cyan','magenta']
data.plot(color=colors)

(Image: three lines with custom colors)

You can use color names or hex color codes, such as '#000000' for black. All of the defined color names are listed in matplotlib's colors.py file. Below is a link to the colors.py file in matplotlib's GitHub repo.

https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/colors.py
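The same color names can also be inspected from Python without opening the source file. A small sketch, assuming a reasonably recent matplotlib (2.0 or later) where `matplotlib.colors` exposes `CSS4_COLORS`, `to_hex`, `to_rgba`, and `is_color_like`; the variable names are illustrative only.

```python
from matplotlib import colors as mcolors

# The names used in the example above are all in the CSS4 named-color table.
names = ["yellowgreen", "cyan", "magenta"]
for name in names:
    # Print the name, its hex code from the table, and its RGBA tuple.
    print(name, mcolors.CSS4_COLORS[name], mcolors.to_rgba(name))

# Names and hex codes are interchangeable ways to write the same color.
black_hex = mcolors.to_hex("black")       # '#000000'
black_rgba = mcolors.to_rgba("#000000")   # (0.0, 0.0, 0.0, 1.0)

# is_color_like is a quick way to check a string before passing it to plot().
is_valid = mcolors.is_color_like("not-a-color")  # False
```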

Answer 3 (score: 4)