"Faking" axis ticks and labels in Matplotlib

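Time: 2015-07-24 03:07:59

Tags: python matplotlib

I'm trying to create a figure in which every dimension of a multidimensional data set is plotted against every other in a grid of subplots. Here is what I have so far: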

[Image: current state of the subplot grid]

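The x dimension is determined by the subplot column and the y dimension by the row. When the two dimensions are the same, a 1-D density plot is drawn with density on the y-axis; otherwise a 2-D histogram is used, with density mapped to color. When creating each subplot, I share its x-axis with the first plot in that column (using the sharex argument of Figure.add_subplot). The y-axes are shared in a similar way, except for the 1-D plots.

This keeps the axes on the same scale, but you can see the problem at the top left. Because most axes are identical along a row or column, I only show ticks on the bottom and left edges of the figure. The problem is that the top-left subplot has a different y scale from the rest of its row.

What I'd like is to take the y-axis ticks from the other subplots in the row and apply them to the top-left subplot, without changing that subplot's limits. Getting the y tick labels from the second subplot in the row and setting them on the first works, but actually changing the tick positions does not, because the axis limits aren't the same. I can't figure out how to set the tick positions relatively, other than by explicitly converting points from one plot's scale to the other's.

Edit: since someone asked, here is a basic version of the code used to generate this: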

import numpy as np
from scipy.stats import gaussian_kde

def matrix_plot(figure, data, limits, labels):
    """
    Args:
        figure: matplotlib Figure
        data: numpy.ndarray, points/observations in rows
        limits: list of (min, max) values for axis limits
        labels: list of labels for each dimension
    """

    # Number of dimensions (data columns)
    ndim = data.shape[1]

    # Create KDE objects
    density = [ gaussian_kde(data[:,dim]) for dim in range(ndim) ]

    # Keep track of subplots
    plots = np.ndarray((ndim, ndim), dtype=object)

    # Loop through dimensions twice
    # dim1 goes by column
    for dim1 in range(ndim):
        # dim2 goes by row
        for dim2 in range(ndim):

            # Index of plot
            i = dim2 * ndim + dim1 + 1

            # Share x-axis with plot at top of column
            # Share y-axis with plot at beginning of row, unless that
            #    plot or current plot is a 1d plot
            kwargs = dict()
            if dim2 > 0:
                kwargs['sharex'] = plots[0][dim1]
                if dim1 > 0 and dim1 != dim2:
                    kwargs['sharey'] = plots[dim2][0]
            elif dim1 > 1:
                kwargs['sharey'] = plots[dim2][1]

            # Create new subplot
            # Pass in shared axis arguments with **kwargs
            plot = figure.add_subplot(ndim, ndim, i, **kwargs)
            plots[dim2][dim1] = plot

            # 1d density plot
            if dim1 == dim2:

                # Space to plot over
                x = np.linspace(limits[dim1][0], limits[dim1][1], 100)

                # Plot filled region (dim1 == dim2 on the diagonal)
                plot.set_xlim(limits[dim1])
                plot.fill_between(x, density[dim1].evaluate(x))

            # 2d density plot
            else:

                # Make histogram
                h, xedges, yedges = np.histogram2d(data[:,dim1],
                    data[:,dim2], range=[limits[dim1], limits[dim2]],
                    bins=250)

                # Set zero bins to NaN to make empty regions of
                #   plot transparent
                h[h == 0] = np.nan

                # Plot without grid
                plot.imshow(h.T, origin='lower',
                    extent=np.concatenate((limits[dim1], limits[dim2])),
                    aspect='auto')
                plot.grid(False)

            # Hide ticks and labels everywhere except on the figure edges
            plot.tick_params(axis='both', which='both', left=False,
                right=False, bottom=False, top=False, labelleft=False,
                labelbottom=False)
            if dim1 == 0:
                plot.tick_params(axis='y', left=True, labelleft=True)
                plot.set_ylabel(labels[dim2])
            if dim2 == ndim - 1:
                plot.tick_params(axis='x', bottom=True, labelbottom=True)
                plot.set_xlabel(labels[dim1])

    # Tight layout (applied once, after all subplots are created)
    figure.tight_layout(pad=.1, h_pad=0, w_pad=0)

Here is what I get when I try to copy the y-axis tick positions and labels from the second plot in the first row to the first plot:

plots[0][0].set_yticks(plots[0][1].get_yticks())
plots[0][0].set_yticklabels(plots[0][1].get_yticklabels())

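[Image: result of copying the ticks and labels]

Notice how the tick positions are specified on an absolute scale, which is far above the scale of the density plot. The axis limits expand to show the ticks, so the actual density curve is squashed to the bottom. In addition, the labels do not show up.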
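The limit expansion described above can be reproduced in isolation. The following is a minimal sketch, assuming a reasonably recent Matplotlib and using two hypothetical stand-in axes (`ax_density` and `ax_2d`, with made-up limits) rather than the real subplot grid:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

fig, (ax_density, ax_2d) = plt.subplots(1, 2)

# Hypothetical stand-ins for the first two subplots of the top row
ax_density.set_ylim(0.0, 0.4)    # density axis: small values
ax_2d.set_ylim(0.0, 10.0)        # data axis: much larger values
ax_2d.set_yticks([0, 2, 4, 6, 8, 10])

before = ax_density.get_ylim()
# Copy the tick positions verbatim, in the other axis's data units
ax_density.set_yticks(ax_2d.get_yticks())
after = ax_density.get_ylim()

print("y limits before:", before)
print("y limits after: ", after)
```

Because the copied positions are interpreted in the density axis's own data coordinates, the view limits grow to contain them, which is why the curve ends up squashed at the bottom.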

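1 Answer:

Answer 0 (score: 1)

Thanks to Ajean's comment, which told me about the scatter_matrix function in the pandas package; it does more or less what I'm trying to do here. I looked at the source code on GitHub and found the part where they "fix" the axis on the top-left plot so that it corresponds to the row's shared y-axis instead of the density axis: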

if len(df.columns) > 1:
    lim1 = boundaries_list[0]
    locs = axes[0][1].yaxis.get_majorticklocs()
    locs = locs[(lim1[0] <= locs) & (locs <= lim1[1])]
    adj = (locs - lim1[0]) / (lim1[1] - lim1[0])

    lim0 = axes[0][0].get_ylim()
    adj = adj * (lim0[1] - lim0[0]) + lim0[0]
    axes[0][0].yaxis.set_ticks(adj)

    if np.all(locs == locs.astype(int)):
        # if all ticks are int
        locs = locs.astype(int)
    axes[0][0].yaxis.set_ticklabels(locs)

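Unfortunately, it looks like what I feared: there is no more elegant way than manually converting the tick positions from one range to the other. Here is my version, which goes right after the double loop: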

# Check there are more plots in the row, just in case
if ndim > 1:
    # Get tick locations from 2nd plot in first row
    ticks = np.asarray(plots[0][1].yaxis.get_majorticklocs())

    # Throw out the ones that aren't within the limit
    # (Copied from pandas code, but probably not necessary)
    ticks = ticks[(ticks >= limits[0][0]) & (ticks <= limits[0][1])]

    # Scale ticks to range of [0, 1] (relative to axis limits)
    ticks_scaled = (ticks - limits[0][0]) / (limits[0][1] - limits[0][0])

    # Y limits of top-left density plot (was automatically determined
    #       by matplotlib)
    dlim = plots[0][0].get_ylim()

    # Set the ticks scaled to the plot's own y-axis
    plots[0][0].set_yticks((ticks_scaled * (dlim[1] - dlim[0])) + dlim[0])

    # Set tick labels to their original positions on the 2d plot
    plots[0][0].set_yticklabels(ticks)

This gives me the result I was looking for.
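The conversion used in the code above is a plain linear map between two ranges, so it can be factored into a small dependency-free helper. This is only a sketch: `rescale_ticks` and its parameter names are hypothetical, not part of Matplotlib or pandas, and the sample numbers are made up.

```python
def rescale_ticks(ticks, src_limits, dst_limits):
    """Linearly map tick values from src_limits onto dst_limits.

    Returns (positions, labels): positions are in the destination
    axis's coordinates, labels are the original tick values to display.
    """
    src_lo, src_hi = src_limits
    dst_lo, dst_hi = dst_limits
    if src_hi == src_lo:
        raise ValueError("source limits must span a nonzero range")

    # Keep only the ticks that fall inside the source limits
    labels = [t for t in ticks if src_lo <= t <= src_hi]

    # Normalize to [0, 1] relative to the source, then stretch to the destination
    positions = [
        dst_lo + (t - src_lo) / (src_hi - src_lo) * (dst_hi - dst_lo)
        for t in labels
    ]
    return positions, labels


# Made-up example: data axis spans 0..5, density axis spans 0..0.5
positions, labels = rescale_ticks([-2, 0, 2, 4, 6], (0, 5), (0.0, 0.5))
print(positions, labels)
```

With such a helper, the positions would go to `set_yticks` and the labels to `set_yticklabels` on the top-left subplot, mirroring the two calls at the end of the code above.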