我可以使用numpy.polynomial
使术语符合f(x) = 1 + x + x^2
等一维多项式。如何拟合多维多项式,如f(x,y) = 1 + x + x^2 + y + yx + y x^2 + y^2 + y^2 x + y^2 x^2
?看起来numpy根本不支持多维多项式:是这样的吗?在我的实际应用中,我有5个输入维度,我对hermite多项式感兴趣。看起来scipy.special
中的多项式也只适用于一维输入。
# One dimension of data can be fit
x = np.random.random(100)
y = np.sin(x)
params = np.polynomial.polynomial.polyfit(x, y, 6)
np.polynomial.polynomial.polyval([0, .2, .5, 1.5], params)
array([ -5.01799432e-08, 1.98669317e-01, 4.79425535e-01,
9.97606096e-01])
# When I try two dimensions, it fails.
x = np.random.random((100, 2))
y = np.sin(5 * x[:,0]) + .4 * np.sin(x[:,1])
params = np.polynomial.polynomial.polyvander2d(x, y, [6, 6])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-13-5409f9a3e632> in <module>()
----> 1 params = np.polynomial.polynomial.polyvander2d(x, y, [6, 6])
/usr/local/lib/python2.7/site-packages/numpy/polynomial/polynomial.pyc in polyvander2d(x, y, deg)
1201 raise ValueError("degrees must be non-negative integers")
1202 degx, degy = ideg
-> 1203 x, y = np.array((x, y), copy=0) + 0.0
1204
1205 vx = polyvander(x, degx)
ValueError: could not broadcast input array from shape (100,2) into shape (100)
答案 0 :(得分:2)
我很生气,对于任意数量的2D多项式拟合,没有简单的函数,所以我自己编写了一个函数。像其他答案一样,它使用numpy lstsq查找最佳系数。
import numpy as np
from scipy.linalg import lstsq
from scipy.special import binom
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
def _get_coeff_idx(coeff):
idx = np.indices(coeff.shape)
idx = idx.T.swapaxes(0, 1).reshape((-1, 2))
return idx
def _scale(x, y):
# Normalize x and y to avoid huge numbers
# Mean 0, Variation 1
offset_x, offset_y = np.mean(x), np.mean(y)
norm_x, norm_y = np.std(x), np.std(y)
x = (x - offset_x) / norm_x
y = (y - offset_y) / norm_y
return x, y, (norm_x, norm_y), (offset_x, offset_y)
def _unscale(x, y, norm, offset):
x = x * norm[0] + offset[0]
y = y * norm[1] + offset[1]
return x, y
def polyvander2d(x, y, degree):
A = np.polynomial.polynomial.polyvander2d(x, y, degree)
return A
def polyscale2d(coeff, scale_x, scale_y, copy=True):
if copy:
coeff = np.copy(coeff)
idx = _get_coeff_idx(coeff)
for k, (i, j) in enumerate(idx):
coeff[i, j] /= scale_x ** i * scale_y ** j
return coeff
def polyshift2d(coeff, offset_x, offset_y, copy=True):
if copy:
coeff = np.copy(coeff)
idx = _get_coeff_idx(coeff)
# Copy coeff because it changes during the loop
coeff2 = np.copy(coeff)
for k, m in idx:
not_the_same = ~((idx[:, 0] == k) & (idx[:, 1] == m))
above = (idx[:, 0] >= k) & (idx[:, 1] >= m) & not_the_same
for i, j in idx[above]:
b = binom(i, k) * binom(j, m)
sign = (-1) ** ((i - k) + (j - m))
offset = offset_x ** (i - k) * offset_y ** (j - m)
coeff[k, m] += sign * b * coeff2[i, j] * offset
return coeff
def plot2d(x, y, z, coeff):
# regular grid covering the domain of the data
if x.size > 500:
choice = np.random.choice(x.size, size=500, replace=False)
else:
choice = slice(None, None, None)
x, y, z = x[choice], y[choice], z[choice]
X, Y = np.meshgrid(
np.linspace(np.min(x), np.max(x), 20), np.linspace(np.min(y), np.max(y), 20)
)
Z = np.polynomial.polynomial.polyval2d(X, Y, coeff)
fig = plt.figure()
ax = fig.gca(projection="3d")
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, alpha=0.2)
ax.scatter(x, y, z, c="r", s=50)
plt.xlabel("X")
plt.ylabel("Y")
ax.set_zlabel("Z")
plt.show()
def polyfit2d(x, y, z, degree=1, max_degree=None, scale=True, plot=False):
"""A simple 2D polynomial fit to data x, y, z
The polynomial can be evaluated with numpy.polynomial.polynomial.polyval2d
Parameters
----------
x : array[n]
x coordinates
y : array[n]
y coordinates
z : array[n]
data values
degree : {int, 2-tuple}, optional
degree of the polynomial fit in x and y direction (default: 1)
max_degree : {int, None}, optional
if given the maximum combined degree of the coefficients is limited to this value
scale : bool, optional
Wether to scale the input arrays x and y to mean 0 and variance 1, to avoid numerical overflows.
Especially useful at higher degrees. (default: True)
plot : bool, optional
wether to plot the fitted surface and data (slow) (default: False)
Returns
-------
coeff : array[degree+1, degree+1]
the polynomial coefficients in numpy 2d format, i.e. coeff[i, j] for x**i * y**j
"""
# Flatten input
x = np.asarray(x).ravel()
y = np.asarray(y).ravel()
z = np.asarray(z).ravel()
# Remove masked values
mask = ~(np.ma.getmask(z) | np.ma.getmask(x) | np.ma.getmask(y))
x, y, z = x[mask].ravel(), y[mask].ravel(), z[mask].ravel()
# Scale coordinates to smaller values to avoid numerical problems at larger degrees
if scale:
x, y, norm, offset = _scale(x, y)
if np.isscalar(degree):
degree = (int(degree), int(degree))
degree = [int(degree[0]), int(degree[1])]
coeff = np.zeros((degree[0] + 1, degree[1] + 1))
idx = _get_coeff_idx(coeff)
# Calculate elements 1, x, y, x*y, x**2, y**2, ...
A = polyvander2d(x, y, degree)
# We only want the combinations with maximum order COMBINED power
if max_degree is not None:
mask = idx[:, 0] + idx[:, 1] <= int(max_degree)
idx = idx[mask]
A = A[:, mask]
# Do the actual least squares fit
C, *_ = lstsq(A, z)
# Reorder coefficients into numpy compatible 2d array
for k, (i, j) in enumerate(idx):
coeff[i, j] = C[k]
# Reverse the scaling
if scale:
coeff = polyscale2d(coeff, *norm, copy=False)
coeff = polyshift2d(coeff, *offset, copy=False)
if plot:
if scale:
x, y = _unscale(x, y, norm, offset)
plot2d(x, y, z, coeff)
return coeff
if __name__ == "__main__":
n = 100
x, y = np.meshgrid(np.arange(n), np.arange(n))
z = x ** 2 + y ** 2
c = polyfit2d(x, y, z, degree=2, plot=True)
print(c)
答案 1 :(得分:1)
我相信你误解了polyvander2d
做了什么以及应该如何使用它。 polyvander2d()
返回度deg
的伪Vandermonde矩阵和样本点(x, y)
。
这里,y
不是点x
处的多项式的值,而是y
- 点的坐标和{ {1}}是x
- 坐标。粗略地说,返回的数组是x
和(x**i) * (y**j)
的组合,x
本质上是2D“网格”。因此,y
和x
必须具有相同的形状。
但是,y
和x
数组的形状不同:
y
我不相信>>> x.shape
(100, 2)
>>> y.shape
(100,)
有5D - numpy
形式polyvander
。注意,这里的所有变量都是坐标,而不是多项式polyvander5D(x, y, z, v, w, deg)
的值。但是,您似乎将p=p(x,y,z,v,w)
(在2D情况下)用作y
。
似乎f
没有numpy
函数的2D或更高等价物。如果您打算在更高维度中找到最佳拟合多项式的系数,我建议您概括一下此处描述的方法:Equivalent of `polyfit` for a 2D polynomial in Python
答案 2 :(得分:1)
看起来polyfit不支持拟合多元多项式,但你可以用linalg.lstsq
手工完成。步骤如下:
收集您希望在模型中使用的单项式x**i * y**j
的度数。仔细考虑一下:你当前的模型已经有9个参数,如果你要推送到5个变量,那么用当前的方法你最终会得到3 ** 5 = 243个参数,这是过度拟合的必然之路。也许限制__total_度的单项式最多2或3 ......
将x点插入每个单项;这给出了1D阵列。将所有此类数组堆叠为矩阵的列。
用前面提到的矩阵解决一个线性系统,右边是目标值(我称之为z,因为当你还使用x,y作为两个变量时,y会让人困惑)。
这是:
import numpy as np
x = np.random.random((100, 2))
z = np.sin(5 * x[:,0]) + .4 * np.sin(x[:,1])
degrees = [(i, j) for i in range(3) for j in range(3)] # list of monomials x**i * y**j to use
matrix = np.stack([np.prod(x**d, axis=1) for d in degrees], axis=-1) # stack monomials like columns
coeff = np.linalg.lstsq(matrix, z)[0] # lstsq returns some additional info we ignore
print("Coefficients", coeff) # in the same order as the monomials listed in "degrees"
fit = np.dot(matrix, coeff)
print("Fitted values", fit)
print("Original values", y)
答案 3 :(得分:0)
选项不存在,因为没有人愿意这样做。线性地组合多项式(f(x,y) = 1 + x + y + x^2 + y^2
)并自己求解方程组。