Modify a PCA plot to show fewer dimensions per plot

Time: 2018-01-26 17:59:35

Tags: python matplotlib

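I'm using the code and sample data below to plot the feature weights from a PCA so that the makeup of each dimension is easy to see. The problem is that it tries to fit every dimension into a single plot. I'd like to show only five dimensions per plot, or five dimensions in a row with the next five below, and so on until I run out of dimensions, so that the result is easier to read. Can anyone suggest a slick way to modify the code so that it shows five dimensions per plot and loops until the dimensions are exhausted?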

Code:

from sklearn.decomposition import PCA
pca = PCA().fit(log_data)

# Generate PCA results plot
# Use a separate name so the pca_results function is not overwritten
results = pca_results(log_data, pca)

Function:

###########################################
# Suppress matplotlib user warnings
# Necessary for newer version of matplotlib
import warnings
warnings.filterwarnings("ignore", category = UserWarning, module = "matplotlib")
#
# Display inline matplotlib plots with IPython
from IPython import get_ipython
get_ipython().run_line_magic('matplotlib', 'inline')
###########################################

import matplotlib.pyplot as plt
import matplotlib.cm as cm
import pandas as pd
import numpy as np

def pca_results(good_data, pca):
    '''
    Create a DataFrame of the PCA results
    Includes dimension feature weights and explained variance
    Visualizes the PCA results
    '''

    # Dimension indexing
    dimensions = ['Dimension {}'.format(i) for i in range(1,len(pca.components_)+1)]

    # PCA components
    components = pd.DataFrame(np.round(pca.components_, 4), columns = good_data.keys())
    components.index = dimensions

    # PCA explained variance
    ratios = pca.explained_variance_ratio_.reshape(len(pca.components_), 1)
    variance_ratios = pd.DataFrame(np.round(ratios, 4), columns = ['Explained Variance'])
    variance_ratios.index = dimensions

    # Create a bar plot visualization
    fig, ax = plt.subplots(figsize = (14,8))

    # Plot the feature weights as a function of the components
    components.plot(ax = ax, kind = 'bar');
    ax.set_ylabel("Feature Weights")
    ax.set_xticklabels(dimensions, rotation=0)


    # Display the explained variance ratios
    for i, ev in enumerate(pca.explained_variance_ratio_):
        ax.text(i-0.40, ax.get_ylim()[1] + 0.05, "Explained Variance\n          %.4f"%(ev))

    # Return a concatenated DataFrame
    return pd.concat([variance_ratios, components], axis = 1)

Data:

print(log_data.iloc[1:50])


    OrderCnt  Damaged_Spoiled  Item_Inc   missing     Other     P_rep  \
1   0.693147         0.000000  0.000000  0.000000  0.000000  0.000000   
5   1.098612         0.000000  0.000000  0.000000  0.000000  0.000000   
6   1.791759         0.000000  0.000000  0.000000  0.000000  0.000000   
7   2.564949         0.000000  0.000000  0.000000  0.000000  0.000000   
8   2.833213         0.000000  0.000000  0.000000  0.000000  0.000000   
9   3.433987         1.098612  0.000000  0.693147  0.000000  0.000000   
10  3.332205         1.098612  0.000000  0.000000  0.000000  0.000000   
11  3.091042         0.000000  0.000000  0.000000  0.000000  0.000000   
12  3.178054         0.000000  0.000000  0.000000  0.000000  0.000000   
13  3.332205         0.000000  0.000000  0.000000  0.000000  0.000000   
14  3.637586         0.000000  0.693147  0.693147  0.000000  0.000000   
15  3.610918         0.000000  0.000000  0.000000  0.000000  0.000000   
16  3.583519         0.000000  0.000000  0.000000  0.000000  0.000000   
17  3.433987         0.000000  0.000000  0.000000  0.000000  0.000000   
18  3.332205         0.000000  0.693147  0.000000  0.000000  0.000000   
19  2.890372         0.000000  0.000000  0.000000  0.000000  0.000000   
20  2.197225         0.000000  0.000000  0.000000  0.000000  0.000000   
21  1.098612         0.000000  0.000000  0.000000  0.000000  0.000000   
22  0.693147         0.000000  0.000000  0.000000  0.000000  0.000000   
28  0.693147         0.000000  0.000000  0.000000  0.000000  0.000000   
29  1.098612         0.000000  0.000000  0.000000  0.000000  0.000000   
30  1.945910         0.000000  0.000000  0.000000  0.000000  0.000000   
31  2.302585         0.000000  0.000000  0.000000  0.000000  0.000000   
32  2.995732         0.000000  0.000000  1.098612  0.000000  0.000000   
33  3.367296         0.693147  0.000000  0.693147  0.000000  0.000000   
34  3.332205         0.693147  0.000000  0.693147  0.000000  0.000000   
35  3.367296         0.693147  0.000000  0.000000  0.000000  0.000000   
36  3.135494         0.000000  0.000000  0.693147  0.000000  0.000000   
37  3.401197         0.000000  0.000000  0.000000  0.000000  0.000000   
38  3.737670         0.000000  0.000000  0.693147  0.000000  0.000000   
39  3.583519         0.693147  0.000000  0.000000  0.000000  0.000000   
40  3.784190         0.693147  0.000000  0.000000  0.000000  0.000000   
41  3.761200         0.693147  0.000000  1.098612  0.000000  0.000000   
42  3.332205         0.000000  0.000000  0.000000  0.000000  0.693147   
43  2.833213         0.000000  0.000000  0.000000  0.000000  0.000000   
44  2.397895         1.098612  0.000000  0.000000  0.000000  0.000000   
45  1.098612         0.000000  0.000000  0.000000  0.000000  0.000000   
46  0.693147         0.000000  0.000000  0.000000  0.000000  0.000000   
53  2.079442         0.000000  0.000000  0.000000  0.000000  0.000000   
54  2.079442         0.000000  0.000000  0.000000  0.000000  0.000000   
55  2.944439         0.000000  0.000000  0.000000  0.000000  0.000000   
56  3.465736         0.693147  0.000000  0.000000  0.000000  0.000000   
57  3.828641         1.386294  0.000000  1.098612  0.000000  0.693147   
58  3.688879         0.000000  0.000000  0.693147  0.000000  0.000000   
59  3.737670         0.000000  0.000000  1.386294  0.693147  0.693147   
60  3.737670         0.693147  0.000000  0.000000  0.000000  0.000000   
61  3.761200         0.000000  0.000000  0.000000  0.000000  0.000000   
62  3.761200         0.000000  0.000000  0.693147  0.000000  0.000000   
63  3.583519         0.000000  0.000000  0.000000  0.000000  0.000000   

      P_serv  Wrong_item       chi       nyc        sf    rate_0    rate_1  \
1   0.693147    0.000000  0.000000  0.000000  0.693147  0.000000  0.000000   
5   0.000000    0.000000  1.098612  0.000000  0.000000  0.000000  0.000000   
6   0.000000    0.000000  1.791759  0.000000  0.000000  0.000000  0.000000   
7   0.693147    1.386294  2.079442  0.693147  1.791759  0.000000  0.000000   
8   0.000000    0.000000  2.079442  1.386294  1.945910  0.000000  0.000000   
9   1.098612    1.386294  2.564949  0.693147  2.995732  0.000000  1.386294   
10  1.098612    0.000000  2.397895  1.098612  2.890372  0.000000  1.386294   
11  0.000000    0.000000  2.397895  1.791759  1.945910  0.000000  0.693147   
12  0.000000    0.693147  2.079442  1.609438  2.564949  0.000000  0.693147   
13  0.000000    0.000000  2.708050  1.098612  2.564949  0.000000  0.693147   
14  0.000000    0.693147  2.772589  1.945910  2.833213  0.000000  0.000000   
15  0.000000    0.000000  2.708050  2.484907  2.564949  0.693147  0.000000   
16  0.000000    0.693147  2.833213  1.791759  2.708050  0.693147  0.000000   
17  0.000000    0.693147  2.833213  1.791759  2.397895  0.000000  0.000000   
18  0.000000    1.098612  2.197225  1.098612  2.890372  0.000000  0.000000   
19  0.000000    0.000000  1.945910  1.098612  2.397895  0.000000  0.000000   
20  0.000000    0.000000  0.000000  0.000000  2.197225  0.000000  0.000000   
21  0.000000    0.000000  0.000000  0.000000  1.098612  0.000000  0.000000   
22  0.000000    0.000000  0.000000  0.000000  0.693147  0.000000  0.000000   
28  0.000000    0.000000  0.000000  0.693147  0.000000  0.000000  0.000000   
29  0.000000    0.000000  0.693147  0.693147  0.000000  0.000000  0.000000   
30  0.000000    0.000000  1.791759  0.000000  0.693147  0.000000  0.000000   
31  0.000000    0.000000  2.197225  1.386294  0.000000  0.000000  0.000000   
32  0.000000    0.000000  2.079442  1.098612  2.484907  0.000000  0.000000   
33  0.000000    0.693147  2.397895  0.693147  2.890372  0.000000  0.000000   
34  0.000000    1.098612  2.708050  1.098612  2.772589  0.000000  0.693147   
35  0.000000    1.386294  2.564949  1.098612  2.772589  0.000000  0.000000   
36  0.000000    0.000000  2.397895  1.609438  2.197225  0.000000  0.693147   
37  0.000000    1.098612  2.639057  0.693147  2.890372  0.000000  0.693147   
38  0.000000    0.000000  3.135494  1.386294  2.833213  0.000000  0.693147   
39  0.000000    0.000000  2.890372  1.791759  2.639057  0.000000  0.000000   
40  0.000000    1.098612  3.295837  1.386294  2.772589  1.386294  0.000000   
41  0.000000    1.098612  2.944439  1.098612  3.218876  0.000000  1.098612   
42  0.693147    0.693147  2.302585  1.098612  2.944439  0.000000  0.000000   
43  0.000000    0.000000  1.098612  0.000000  2.772589  0.000000  0.000000   
44  1.098612    0.000000  0.000000  0.000000  2.397895  0.000000  0.000000   
45  0.000000    0.000000  0.000000  0.000000  1.098612  0.000000  0.000000   
46  0.000000    0.000000  0.000000  0.000000  0.693147  0.000000  0.000000   
53  0.000000    0.000000  1.609438  1.098612  0.693147  0.000000  0.000000   
54  0.000000    0.000000  1.945910  0.693147  0.693147  0.000000  0.000000   
55  0.000000    0.000000  2.564949  1.791759  0.693147  0.693147  0.000000   
56  0.000000    0.000000  2.564949  2.079442  2.564949  0.000000  0.000000   
57  0.693147    0.693147  2.995732  2.079442  3.258097  0.693147  0.000000   
58  0.000000    0.000000  3.218876  1.386294  2.708050  0.000000  0.000000   
59  0.000000    0.693147  3.135494  1.386294  2.995732  0.000000  0.693147   
60  0.000000    1.098612  2.833213  1.098612  3.178054  0.000000  0.000000   
61  0.000000    1.098612  3.044522  1.791759  3.044522  0.000000  0.000000   
62  0.693147    0.000000  3.044522  1.098612  3.135494  0.000000  0.000000   
63  0.693147    0.000000  2.833213  1.098612  2.890372  0.000000  0.000000   

      rate_2    rate_3    rate_4    rate_5  
1   0.000000  0.693147  0.000000  0.000000  
5   0.000000  0.000000  0.000000  1.098612  
6   0.000000  0.000000  0.000000  1.791759  
7   1.098612  0.000000  1.098612  2.302585  
8   0.693147  0.693147  1.098612  2.564949  
9   1.386294  0.693147  1.609438  3.091042  
10  0.000000  0.693147  1.386294  3.135494  
11  0.000000  0.693147  1.098612  2.890372  
12  0.000000  0.000000  1.609438  2.944439  
13  0.693147  0.000000  0.693147  3.258097  
14  0.000000  1.098612  1.098612  3.526361  
15  0.693147  0.693147  1.386294  3.465736  
16  0.693147  1.386294  0.693147  3.401197  
17  0.000000  0.693147  1.791759  3.258097  
18  0.693147  1.098612  1.386294  3.091042  
19  0.693147  0.000000  0.693147  2.833213  
20  0.000000  0.000000  0.693147  2.079442  
21  0.000000  0.000000  0.000000  1.098612  
22  0.000000  0.000000  0.693147  0.000000  
28  0.000000  0.000000  0.693147  0.000000  
29  0.000000  0.000000  0.000000  1.098612  
30  0.000000  0.000000  0.693147  1.791759  
31  0.000000  0.000000  0.000000  2.484907  
32  0.693147  0.693147  1.609438  2.708050  
33  0.000000  1.098612  2.079442  2.995732  
34  0.000000  0.000000  2.079442  3.178054  
35  0.693147  0.693147  1.945910  3.091042  
36  0.000000  0.693147  0.000000  3.044522  
37  1.098612  0.000000  1.386294  3.258097  
38  0.693147  1.098612  1.098612  3.583519  
39  0.000000  0.000000  1.791759  3.433987  
40  0.000000  1.098612  2.079442  3.496508  
41  0.693147  1.098612  2.079442  3.496508  
42  0.693147  0.000000  0.693147  3.332205  
43  0.000000  0.693147  1.098612  2.708050  
44  0.693147  0.000000  1.609438  1.791759  
45  0.000000  0.000000  0.693147  0.693147  
46  0.000000  0.000000  0.000000  0.693147  
53  0.000000  0.693147  0.693147  1.791759  
54  0.000000  0.000000  1.098612  1.945910  
55  0.000000  0.693147  1.098612  2.708050  
56  0.000000  1.098612  2.079442  3.135494  
57  0.000000  1.791759  2.197225  3.637586  
58  0.000000  1.098612  1.791759  3.555348  
59  0.693147  1.945910  1.791759  3.465736  
60  0.000000  0.693147  1.945910  3.555348  
61  0.693147  1.098612  1.609438  3.663562  
62  0.000000  1.386294  1.386294  3.663562  
63  0.693147  0.000000  1.609438  3.433987  

1 answer:

Answer 0: (score: 1)

Try replacing your plotting code with the following:

# Configure the number of dims to show per subplot
dims_per_plot = dpp = 5

# Prepare plot with appropriate number of subplots
# Note: see [1]
plot_rows = -(-len(dimensions) // dims_per_plot)
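fig, axes = plt.subplots(plot_rows, 1, figsize = (14,8*plot_rows))
# plt.subplots returns a bare Axes when plot_rows == 1, so make it iterable
axes = np.atleast_1d(axes)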

# For each subplot...
for c, ax in enumerate(axes):

    # Plot the appropriate components
    components.iloc[c*dpp:c*dpp+dpp].plot(ax=ax, kind='bar');
    ax.set_ylabel("Feature Weights")

    # Configure the xticks
    # Note: the tick count must match the label count, and a fixed xlim keeps
    # bar widths consistent in partially filled plots
    labels = dimensions[c*dpp:c*dpp+dpp]
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels, rotation=0)
    ax.set_xlim(-0.5, dpp - 0.5)

    # Display the explained variance ratios
    # Note: the ha and multialignment kwargs allow centering of (multiline) text
    for i, ev in enumerate(pca.explained_variance_ratio_[c*dpp:c*dpp+dpp]):
        ax.text(i, ax.get_ylim()[1] + 0.02, 
                "Explained Variance\n%.4f" % (ev), 
                ha='center', multialignment='center')

# Done
plt.show()

[1] StackOverflow: a ceiling equivalent of the // operator in Python.
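
To illustrate the ceiling-division idiom from note [1] and the slice arithmetic the loop relies on, here is a minimal sketch in plain Python. The helper names ceil_div and chunk_bounds are hypothetical and not part of the answer, and the count of 17 dimensions is an assumption based on the 17 columns in the sample data.

```python
def ceil_div(n, d):
    """Ceiling of n / d using only integer arithmetic (the -(-n // d) idiom)."""
    return -(-n // d)


def chunk_bounds(total, per_plot):
    """Return the (start, stop) slice bounds used for each subplot."""
    return [(start, min(start + per_plot, total))
            for start in range(0, total, per_plot)]


# Assumed values: 17 dimensions (one per column in the sample data), 5 per subplot
n_dims = 17
dims_per_plot = 5

print(ceil_div(n_dims, dims_per_plot))      # number of subplot rows: 4
print(chunk_bounds(n_dims, dims_per_plot))  # [(0, 5), (5, 10), (10, 15), (15, 17)]
```

The last pair covers only two dimensions, which is the partially filled subplot that the tick configuration in the answer has to handle.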