Question

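I am trying to solve for the intersection of two equations, y = R*x^1.75 and y = a*x^2 + b*x + c, for every row in my dataframe (about 100K rows). The values of R, a, b and c are different for each row. I can solve them one at a time by iterating over the dataframe and calling fsolve() for each row (as shown below), but I am wondering whether there is a better way to do this.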

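My question is: can this be turned into an array calculation, i.e. can all rows be solved at once? Any ideas on how to make the calculation faster would be very helpful.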

Here is a sample dataframe with the coefficients:

     R     a     b      c
0  0.5 -0.01 -0.50  32.42
1  0.6  0.00  0.07  14.12
2  0.7 -0.01 -0.50  32.42

Here is the working example code I am using to test the approach:

import numpy as np
import pandas as pd
from scipy.optimize import fsolve

# The fSolve function
def myFunction(zGuess,*Params):
    # Get the coefficients
    R,a,b,c = Params
    # Get the initial guess
    x,y = zGuess

    F = np.empty((2))
    F[0] = R*x**1.75-y
    F[1] = a*x**2+b*x+c-y
    return F

# Example dataframe (the real one has about 100K rows of different coefficients)
df = pd.DataFrame({"R":[0.500, 0.600,0.700],
                   "a":[-0.01, 0.000,-0.01],
                   "b":[-0.50, 0.070,-0.50],
                   "c":[32.42, 14.12,32.42]})
# Initial guess
zGuess = np.array([50,50])

# Make a place to store the answers
df["x"] = None
df["y"] = None

# Loop through the rows?
for index, coeffs in df.iterrows():
    # Get the coefficients
    Params = (coeffs["R"],coeffs["a"],coeffs["b"],coeffs["c"])
    # fSolve
    z = fsolve(myFunction,zGuess,args=Params)
    # Set the answers
    df.loc[index,"x"] = z[0]
    df.loc[index,"y"] = z[1]

print(df)

============================================

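Solutions (which answer is faster):

I got the two answers below, and both give mathematically correct results. So at this point it comes down to which one computes faster! The test dataframe will be 3K rows.

Answer #1 (Newton's method)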

# Solution 1
import numpy as np
import pandas as pd
Count = 1000
df = pd.DataFrame({"R":[0.500, 0.600,0.700]*Count,
                   "a":[-0.01, 0.000,-0.01]*Count,
                   "b":[-0.50, 0.070,-0.50]*Count,
                   "c":[32.42, 14.12,32.42]*Count})

from datetime import datetime
t_start = datetime.now()
#---------------------------------
InitialGuess = 50.0
Iterations = 20
x = np.full(df["a"].shape, InitialGuess)

for i in range(Iterations):
    x = x - (-df["R"]*x**1.75 + df["a"]*x**2 + df["b"]*x + df["c"])/(-1.75*df["R"]*x**0.75 + 2*df["a"]*x + df["b"])

df["x"] = x
df["y"] = df["R"]*x**1.75
df["x Error"] = df["a"]*x**2 + df["b"]*x + df["c"] - df["R"]*x**1.75
#---------------------------------
t_end = datetime.now()
print ('\n\n\nTime spent running this was:')
print(t_end - t_start)
print(df)

Time taken:

Time spent running this was:
0:00:00.015000

Answer #2 (fsolve)

# Solution 2
import numpy as np
import pandas as pd
from scipy.optimize import fsolve
Count = 1000
df = pd.DataFrame({"R":[0.500, 0.600,0.700]*Count,
                   "a":[-0.01, 0.000,-0.01]*Count,
                   "b":[-0.50, 0.070,-0.50]*Count,
                   "c":[32.42, 14.12,32.42]*Count})

from datetime import datetime
t_start = datetime.now()
#---------------------------------
coefs = df.values[:, 0:4]

def mfun(x, *args):
    args = np.array(args[0], dtype=np.float64)
    return args[:,1] * x**2 + args[:,2] * x + args[:,3] - args[:,0] * x**1.75


nrows = coefs.shape[0]
df["x"] = fsolve(mfun, np.ones(nrows) * 50, args=coefs)
df["y"] = coefs[:, 0] * df["x"]**1.75
#---------------------------------
t_end = datetime.now()
print ('\n\n\nTime spent running this was:')
print(t_end - t_start)
print(df)

Time taken:

Time spent running this was:
0:00:35.786000

Final thoughts:

For this particular case, Newton's method is much faster (I can run 300K rows in 0:00:01.139000!). Thank you both!

Answer 1

Perhaps you could use Newton's method:

import numpy as np

data = np.array(
    [[0.5, -0.01, -0.50,  32.42],
     [0.6,  0.00,  0.07,  14.12],
     [0.7, -0.01, -0.50,  32.42]])


R, a, b, c = data.T

x = np.full(a.shape, 10.0)
m = 1.0
for i in range(20):
    x = x - m * (-R*x**1.75 + a*x**2 + b*x + c)/(-1.75*R*x**0.75 + 2*a*x + b)

print(a*x**2 + b*x + c - R * x**1.75)

Output:

[  0.00000000e+00   1.77635684e-15   3.55271368e-15]
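
For reference, the update inside the loop can be read as a Newton step on the single equation obtained by setting the two expressions for y equal. The symbols below follow the code above; m is the step multiplier from the loop, and m = 1 gives the plain Newton step.

```latex
\begin{aligned}
f(x)   &= a x^{2} + b x + c - R\,x^{1.75} = 0 \\
f'(x)  &= 2 a x + b - 1.75\,R\,x^{0.75} \\
x_{n+1} &= x_n - m\,\frac{f(x_n)}{f'(x_n)}, \qquad y = R\,x^{1.75}
\end{aligned}
```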

Take care when choosing the iteration count and the initial value of x.
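
One way to avoid guessing the iteration count is to stop once the Newton step becomes negligible for every row. The following is only a sketch of that idea, not part of the original answer: the function name `newton_rows` and the parameters `tol` and `max_iter` are made up for illustration, and the default starting value of 10.0 is simply the one used above.

```python
import numpy as np

def newton_rows(R, a, b, c, x0=10.0, tol=1e-10, max_iter=50):
    """Solve a*x**2 + b*x + c - R*x**1.75 = 0 for every row at once.

    Hypothetical helper: tol and max_iter are illustrative defaults.
    Returns the roots and a boolean mask of rows whose last step was tiny.
    """
    R, a, b, c = (np.asarray(v, dtype=float) for v in (R, a, b, c))
    x = np.full(R.shape, float(x0))
    converged = np.zeros(R.shape, dtype=bool)
    for _ in range(max_iter):
        f = a * x**2 + b * x + c - R * x**1.75          # residual per row
        fprime = 2 * a * x + b - 1.75 * R * x**0.75     # derivative per row
        step = f / fprime
        x = x - step
        # Relative test on the size of the last Newton step
        converged = np.abs(step) <= tol * np.maximum(1.0, np.abs(x))
        if converged.all():
            break
    return x, converged

data = np.array([[0.5, -0.01, -0.50, 32.42],
                 [0.6,  0.00,  0.07, 14.12],
                 [0.7, -0.01, -0.50, 32.42]])
R, a, b, c = data.T
x, converged = newton_rows(R, a, b, c)
y = R * x**1.75
print(x)
print(converged)
```

Rows where `converged` is False after `max_iter` iterations are the ones worth re-running with a different starting value. Note that `x**1.75` is only real for non-negative x, so a starting value that sends an iterate negative will produce NaN for that row.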

Answer 2

You can eliminate one variable and then use NumPy's array broadcasting:

# Your `df`:
#      R     a     b      c        x        y
# 0  0.5 -0.01 -0.50  32.42  9.69483  26.6327
# 1  0.6  0.00  0.07  14.12  6.18463  14.5529
# 2  0.7 -0.01 -0.50  32.42  8.17467  27.6644

# Solved in one go
from scipy.optimize import fsolve
coefs = df.values[:, 0:4]

def mfun(x, *args):
    args = np.array(args[0], dtype=np.float64)
    return args[:,1] * x**2 + args[:,2] * x + args[:,3] - args[:,0] * x**1.75


nrows = coefs.shape[0]
x = fsolve(mfun, np.ones(nrows) * 50, args=coefs)

y = coefs[:, 0] * x**1.75
x, y
#(array([ 9.69482605,  6.18462999,  8.17467496]),
#array([26.632690454652423, 14.552924099681404, 27.66440941242009], dtype=object))
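
A note on cost: fsolve treats the N unknowns as one coupled system, and when no `fprime` is supplied it estimates a dense N×N Jacobian by finite differences, even though here every row is independent and the true Jacobian is diagonal. That is probably why this approach slows down quickly as the number of rows grows. If a SciPy routine is still preferred, `scipy.optimize.newton` accepts an array `x0` and then solves each element as its own scalar problem. A minimal sketch, assuming a SciPy version recent enough to support array input (1.2 or later); the helper names `f` and `fprime` are illustrative:

```python
import numpy as np
from scipy.optimize import newton

def f(x, R, a, b, c):
    # Residual after eliminating y, evaluated element-wise
    return a * x**2 + b * x + c - R * x**1.75

def fprime(x, R, a, b, c):
    # Derivative of the residual with respect to x
    return 2 * a * x + b - 1.75 * R * x**0.75

coefs = np.array([[0.5, -0.01, -0.50, 32.42],
                  [0.6,  0.00,  0.07, 14.12],
                  [0.7, -0.01, -0.50, 32.42]])
R, a, b, c = coefs.T

# One independent scalar Newton iteration per row, started at 50
x0 = np.full(len(coefs), 50.0)
x_roots = newton(f, x0, fprime=fprime, args=(R, a, b, c))
y_roots = R * x_roots**1.75
print(x_roots, y_roots)
```

With a dataframe, `R, a, b, c` can be taken from the columns as float arrays (for example `df["R"].to_numpy(dtype=float)`) before calling `newton`.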

Solving a set of nonlinear equations as an array
