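I am trying to evaluate a large piecewise polynomial, obtained from a cubic spline, at a set of points. I am trying to do this on the GPU, but I am running into memory limits.
So I would like to evaluate the piecewise polynomial in batches.
Original code: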
Y = some_matrix_of_data_values ;
X = some_vector_of_data_sites ;
pp = spline(X, Y) ; % get the piecewise polynomial form of the cubic spline. The resulting structure is very large.
for t = 1: big_number
hcurrent = ppval(pp,t); %evaluate the piecewise polynomial at t
y(t) = sum(x(t:t+M-1).*hcurrent,1) ; % do some operation of the interpolated value. Most likely not relevant to this question.
end
Vectorized, as a step towards batching on the GPU:
Y = some_matrix_of_data_values ;
X = some_vector_of_data_sites ;
pp = spline(X, Y) ; % get the piecewise polynomial form of the cubic spline. Resulting structure is very large.
batchSize = 1024 ;
for tt = 1: batchSize: big_number
if tt > big_number - batchSize % pick up the remaining values on the last pass through the loop
batchSize = big_number - tt + 1 ; % tt:big_number contains big_number-tt+1 sites
end
hcurrent = ppval(pp ,(tt:tt+batchSize-1) ) ; %evaluate pp at a couple of data sites
ind = bsxfun(@plus, 1:M, (tt-1:1:tt+batchSize-2).') ; %make an index matrix to help with next calculation. Most likely not relevant to this question.
y(tt:tt+batchSize-1) = sum( x(ind).*hcurrent' , 2 ) ; % do some calculation, but now we have done it in batches!
end
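In the revised code the piecewise polynomial is evaluated at several data sites at once, so that part at least works. However, the piecewise polynomial pp itself is too large to store on the GPU. Is there a way to break it up so that it can be processed in batches?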
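One possible direction, sketched here under assumptions rather than as a tested GPU solution: take pp apart with unmkpp, keep only the pieces whose intervals cover the current batch of sites, and rebuild a much smaller structure with mkpp. The data, sizes, and the names ppsub, C, j1, j2 and h below are illustrative stand-ins, not part of the original code, and the sketch runs on the CPU only. Moving ppsub (or its coefficients) to the GPU is left out because it depends on the setup.

```matlab
% Small stand-in data; sizes and names are illustrative assumptions.
M = 4;                          % number of interpolated channels (rows of Y)
X = 0:10:200;                   % data sites
Y = sin((1:M).' * X / 40);      % M-by-numel(X) matrix of data values
pp = spline(X, Y);              % full piecewise polynomial

% Take the structure apart: L pieces, order k, dimension d.
[breaks, coefs, L, k, d] = unmkpp(pp);
C = reshape(coefs, prod(d), L, k);   % coefs rows are grouped piece by piece

big_number = 150;
batchSize  = 64;
h = zeros(M, big_number);
for tt = 1:batchSize:big_number
    t = tt:min(tt + batchSize - 1, big_number);   % sites in this batch
    % Indices of the pieces containing the first and last site of the batch.
    j1 = max(1, sum(breaks(1:L) <= t(1)));
    j2 = max(1, sum(breaks(1:L) <= t(end)));
    % Rebuild a small pp that holds only pieces j1..j2.
    ppsub = mkpp(breaks(j1:j2+1), C(:, j1:j2, :), d);
    h(:, t) = ppval(ppsub, t);                    % matches ppval(pp, t)
end
```

With sorted sites such as tt:tt+batchSize-1, each batch touches only a contiguous run of pieces, so the memory needed per batch depends on the batch rather than on the size of the whole spline. The columns of h can then feed the same sum over x(ind) as in the vectorized loop.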