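I have 16 .xlsx files (a1.xlsx … a16.xlsx), each containing 54 columns and n rows. I want to select the same two columns from every file, draw each pair as a scatter plot, and fit a simple linear regression to each pair. I also want to draw these 16 individual plots in a single figure as a 4 x 4 grid of subplots (plot1-X1vsY1, plot2-X2vsY2, …, plot16-X16vsY16).
I have a Python script that plots one file at a time (Script 1). I also have another script that collects the file names in the current folder into a list (Script 2). How can I combine these scripts to plot all the files on the same figure?
Script 1: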
# Importing modules
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
#
def my_plotter(ax, data1, data2):
    """
    A helper function to make a scatter graph with error bars

    Parameters
    ----------
    ax : Axes
        The axes to draw to
    data1 : array
        The x data
    data2 : array
        The y data

    Returns
    -------
    out : ErrorbarContainer
        container of the artists added
    """
    # Use the standard deviation of y as a constant error bar for every point
    stdev = data2.std()
    out = ax.errorbar(data1, data2, yerr=stdev, ls='', marker='o',
                      markerfacecolor='w', markeredgecolor='k', capsize=0.5, capthick=0, ecolor='darkgray')
    ax.set_xlim(0, 220)
    ax.set_ylim(0, 0.16)
    return out
#
def my_fit(ax, data1, data2):
    """
    A helper function to fit a linear model of the form y = m*x + n

    Parameters
    ----------
    ax : Axes
        The axes to draw to
    data1 : array
        The x data
    data2 : array
        The y data

    Returns
    -------
    out : list
        list of artists added
    """
    # First, collapse each array into one dimension (flatten returns a copy).
    data1 = data1.flatten()
    data2 = data2.flatten()
    # Use scipy to run a regression on the two arrays (x and y) and get the
    # slope and intercept, along with the correlation coefficient r
    # (square it for R-squared), the p-value, and the standard error.
    slope, intercept, r_value, p_value, std_err = stats.linregress(data1, data2)
    line = slope * data1 + intercept
    out = ax.plot(data1, line, c=(0, 0, 0), lw=2, label= r'$\kappa(r)$' ' = {:.3E} x $r$ + {:.3f}'.format(slope,intercept))
    # Set the limits on the Axes that was passed in, not on the current pyplot Axes
    ax.set_xlim(0, 220)
    ax.set_ylim(0, 0.16)
    return out
#
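# x and y1 ... y4 are not defined in this excerpt; they are the columns read from one file
fig, axes = plt.subplots(nrows=1, ncols=4, sharex=True, sharey=True, squeeze=False, figsize=(16,4))
my_plotter(axes[0,0], x, y1)
my_fit(axes[0,0], x, y1)
my_plotter(axes[0,1], x, y2)
my_fit(axes[0,1], x, y2)
my_plotter(axes[0,2], x, y3)
my_fit(axes[0,2], x, y3)
my_plotter(axes[0,3], x, y4)
my_fit(axes[0,3], x, y4)
# Loop variable renamed so it no longer shadows the array of Axes
for i, ax in enumerate(axes.flat):
    ...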
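As a side check on the regression step in my_fit, the sketch below runs stats.linregress on a small synthetic data set. The numbers are made up for illustration and are not taken from my files. Note that the third value returned is the correlation coefficient r, so it has to be squared to get R-squared.

```python
import numpy as np
from scipy import stats

# Synthetic, illustrative data lying exactly on y = 0.0005 * x + 0.02
x = np.array([0.0, 50.0, 100.0, 150.0, 200.0])
y = 0.0005 * x + 0.02

result = stats.linregress(x, y)

# rvalue is the correlation coefficient r; R-squared is its square
r_squared = result.rvalue ** 2
print(result.slope, result.intercept, r_squared)
```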
Script 2:
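import glob
import pandas as pd
import matplotlib.pyplot as plt

for f in sorted(glob.glob('*.xlsx')):
    # sheet_name=1 selects the second sheet (0-based); `sheetname` was renamed to `sheet_name` in newer pandas
    df = pd.read_excel(f, sheet_name=1)
    # Select variables for drawing scatter plots
    x = df[['epi.(km)']].values
    y = df[['kh']].values
    plt.scatter(x,y)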
...
fig, axes = plt.subplots(4,4, figsize=(10,6), sharey=True, dpi=120)
# Plot each axes
for i, ax in enumerate(axes.ravel()):
    ax.scatter(x, y)
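
A minimal sketch of one way the two scripts might be combined: read every file first, then pair each data set with its own Axes using zip. The helper names load_xy, plot_grid and main are hypothetical and introduced only for illustration. The column names 'epi.(km)' and 'kh' and the sheet index 1 are taken from Script 2, and the error-bar styling is a simplified version of my_plotter. This is a sketch under those assumptions, not a definitive implementation.

```python
import glob

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy import stats


def load_xy(path, xcol="epi.(km)", ycol="kh", sheet=1):
    """Read one workbook and return the two columns as 1-D float arrays."""
    df = pd.read_excel(path, sheet_name=sheet)
    return df[xcol].to_numpy(dtype=float), df[ycol].to_numpy(dtype=float)


def plot_grid(datasets, nrows=4, ncols=4):
    """Draw one scatter plot plus fitted line per (title, x, y) entry."""
    fig, axes = plt.subplots(nrows, ncols, sharex=True, sharey=True,
                             squeeze=False, figsize=(16, 16))
    fits = []
    # zip pairs each data set with its own Axes; unused panels stay empty
    for ax, (title, x, y) in zip(axes.flat, datasets):
        x = np.asarray(x, dtype=float).ravel()
        y = np.asarray(y, dtype=float).ravel()
        # Scatter with a constant error bar equal to the std of y
        ax.errorbar(x, y, yerr=y.std(), ls="", marker="o",
                    markerfacecolor="w", markeredgecolor="k",
                    ecolor="darkgray")
        # Simple linear regression y = slope * x + intercept
        fit = stats.linregress(x, y)
        ax.plot(x, fit.slope * x + fit.intercept, c="k", lw=2,
                label="{:.3E} x + {:.3f}".format(fit.slope, fit.intercept))
        ax.set_title(title)
        ax.legend()
        fits.append(fit)
    return fig, axes, fits


def main():
    # Collect the workbooks in the current folder, as in Script 2
    files = sorted(glob.glob("*.xlsx"))
    datasets = [(f, *load_xy(f)) for f in files]
    plot_grid(datasets)
    plt.show()
```

Calling main() from the folder that holds the files would draw the grid. One thing to watch: sorted() orders names lexicographically, so a10.xlsx comes before a2.xlsx unless a numeric sort key is supplied.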