我一直遇到一个问题,即使在调用pool.terminate之后,pool.map也会离开进程。我一直在寻找解决方案,但是它们似乎都存在其他一些问题,例如递归调用map函数或其他会干扰多处理的进程。
因此,我的代码导入了2个NETCDF文件,并使用不同的计算方法处理了其中的数据。这些占用了大量时间(几个6400x6400阵列),因此我尝试对代码进行多处理。多重处理工作正常,我第一次运行代码需要2.5分钟(从8分钟开始减少),但是每次我的代码完成运行时,Spyder的内存使用量都不会减少,并且会在Windows任务管理器中留下额外的python进程。我的代码如下:
import numpy as np
import netCDF4
import math
from math import sin, cos
import logging
from multiprocessing.pool import Pool
import time
start=time.time()
format = "%(asctime)s: %(message)s"
logging.basicConfig(format=format, level=logging.INFO, datefmt="%H:%M:%S")
logging.info("Here we go!")
path = "DATAPATH"
geopath = "DATAPATH"
f = netCDF4.Dataset(path)
f.set_auto_maskandscale(False)
f2 = netCDF4.Dataset(geopath)
i5lut=f.groups['observation_data'].variables['I05_brightness_temperature_lut'][:]
i4lut=f.groups['observation_data'].variables['I05_brightness_temperature_lut'][:]
I5= f.groups['observation_data'].variables['I05'][:]
I4= f.groups['observation_data'].variables['I04'][:]
I5=i5lut[I5]
I4=i4lut[I4]
I4Quality= f.groups['observation_data'].variables['I04_quality_flags'][:]
I5Quality= f.groups['observation_data'].variables['I05_quality_flags'][:]
I3= f.groups['observation_data'].variables['I03']
I2= f.groups['observation_data'].variables['I02']
I1= f.groups['observation_data'].variables['I01']
I1.set_auto_scale(True)
I2.set_auto_scale(True)
I3.set_auto_scale(True)
I1=I1[:]
I2=I2[:]
I3=I3[:]
lats = f2.groups['geolocation_data'].variables['latitude'][:]
lons = f2.groups['geolocation_data'].variables['longitude'][:]
solarZen = f2.groups['geolocation_data'].variables['solar_zenith'][:]
sensorZen= solarZen = f2.groups['geolocation_data'].variables['sensor_zenith'][:]
solarAz = f2.groups['geolocation_data'].variables['solar_azimuth'][:]
sensorAz= solarZen = f2.groups['geolocation_data'].variables['sensor_azimuth'][:]
def kernMe(i, j, band):
if i<250 or j<250:
return -1
else:
return np.mean(band[i-250:i+250:1,j-250:j+250:1])
def thread_me(arr):
start1=arr[0]
end1=arr[1]
start2=arr[2]
end2=arr[3]
logging.info("Im starting at: %d to %d, %d to %d" %(start1, end1, start2, end2))
points = []
avg = np.mean(I4)
for i in range(start1,end1):
for j in range (start2,end2):
if solarZen[i,j]>=90:
if not (I5[i,j]<265 and I4[i,j]<295):#
if I4[i,j]>320 and I4Quality[i,j]==0:
points.append([lons[i,j],lats[i,j], 1])
elif I4[i,j]>300 and I5[i,j]-I4[i,j]>10:
points.append([lons[i,j],lats[i,j], 2])
elif I4[i,j] == 367 and I4Quality ==9:
points.append([lons[i,j],lats[i,j, 3]])
else:
if not ((I1[i,j]>I2[i,j]>I3[i,j]) or (I5[i,j]<265 or (I1[i,j]+I2[i,j]>0.9 and I5[i,j]<295) or
(I1[i,j]+I2[i,j]>0.7 and I5[i,j]<285))):
if not (I1[i,j]+I2[i,j] > 0.6 and I5[i,j]<285 and I3[i,j]>0.3 and I3[i,j]>I2[i,j] and I2[i,j]>0.25 and I4[i,j]<=335):
thetaG= (cos(sensorZen[i,j]*(math.pi/180))*cos(solarZen[i,j]*(math.pi/180)))-(sin(sensorZen[i,j]*(math.pi/180))*sin(solarZen[i,j]*(math.pi/180))*cos(sensorAz[i,j]*(math.pi/180)))
thetaG= math.acos(thetaG)*(180/math.pi)
if not ((thetaG<15 and I1[i,j]+I2[i,j]>0.35) or (thetaG<25 and I1[i,j]+I2[i,j]>0.4)):
if math.floor(I4[i,j])==367 and I4Quality[i,j]==9 and I5>290 and I5Quality[i,j]==0 and (I1[i,j]+I2[i,j])>0.7:
points.append([lons[i,j],lats[i,j, 4]])
elif I4[i,j]-I5[i,j]>25 or True:
kern = kernMe(i, j, I4)
if kern!=-1 or True:
BT4M = max(325, kern)
kern = min(330, BT4M)
if I4[i,j]> kern and I4[i,j]>avg:
points.append([lons[i,j],lats[i,j], 5])
return points
if __name__ == '__main__':
#Separate the arrays into 1616*1600 chunks for multi processing
#TODO: make this automatic, not hardcoded
arg=[[0,1616,0,1600],[0,1616,1600,3200],[0,1616,3200,4800],[0,1616,4800,6400],
[1616,3232,0,1600],[1616,3232,1600,3200],[1616,3232,3200,4800],[1616,3232,4800,6400],
[3232,4848,0,1600],[3232,4848,1600,3200],[3232,4848,3200,4800],[3232,4848,4800,6400],
[4848,6464,0,1600],[4848,6464,1600,3200],[4848,6464,3200,4800],[4848,6464,4800,6400]]
print(arg)
p=Pool(processes = 4)
output= p.map(thread_me, arg)
p.close()
p.join()
print(output)
f.close()
f2.close()
logging.info("Aaaand we're here!")
print(str((time.time()-start)/60))
p.terminate()
我同时使用p.close和p。终止是因为我认为这会有所帮助(没有帮助)。我所有的代码都运行并产生预期的输出,但我必须使用任务管理器手动结束徘徊的过程。关于的任何想法 是什么原因造成的?
我想我把所有相关信息都放在这里,如果您需要更多信息,我会根据要求进行编辑
谢谢。