Writing to a file with a multiprocessing Pool

Date: 2013-06-29 01:48:30

Tags: python file-io multiprocessing

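Hello, I am querying a server for some data and I want to store what I find in a file. If I don't use the pool and multiprocessing, my data is written to the file as expected. What am I doing wrong here? Queues were mentioned, but I don't think I am using them correctly and I can't figure out how to use them. I am new to multiprocessing.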

It seems that only the last value is written to the file, or nothing is written at all...

from multiprocessing import Pool
import time
import os
import json
import urllib2
import pprint

start_time = time.time()
localtime = time.localtime()
unixtime = int(time.mktime(time.gmtime()))

if not os.path.exists('data'):
    os.makedirs('data')
filename = ('data/%i_%i_%i_%i_%i_%i') % (localtime[2], localtime[1], localtime[0], localtime[3], localtime[4], localtime[5])
f = open(filename, 'w')

apikey = '_omitted_'

res = 90

lines = (360 * 180) / (res * res)

basepath = 'https://api.forecast.io/forecast/'+ apikey + '/' 

def getdata(i,j):
    url = basepath + str(j) + ',' + str(i) + ',' + str(unixtime)
    data = json.load(urllib2.urlopen(url))
    lat = data['latitude']
    lon = data['longitude']
    temp = data['currently']['temperature']
    humidity = data['currently']['humidity']
    pressure = data['currently']['pressure']
    clouds = data['currently']['cloudCover']
    winddir = data['currently']['windBearing']
    windspd = data['currently']['windSpeed']    
    dewpoint = data['currently']['dewPoint']
    precip = data['currently']['precipIntensity']
    f.write(('%f %f %f %f %f %f %f %f %f %f\n') % (lat, lon, temp, humidity, pressure, clouds, winddir, windspd, dewpoint, precip))
    #because output needs to be flushed...
    f.close()
#    print('%.2f%% Complete...' % (float(count / lines) * 100))

pool = Pool(processes=1)
for i in range (-180, 180, res):
    for j in range (-90, 90, res):
#        getdata(i,j)
        pool.apply_async(getdata, (i, j))
pool.close()
pool.join()


print('Total time for Execution: %f Minutes' % ((time.time() - start_time) / 60))

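When I change the number of processes, my output changes: the more processes I use, the more data shows up in my file. I think each worker is overwriting another worker or overwriting itself. It seems that each worker can only put one line of text into the file.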
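
One likely explanation for the "one line per worker" symptom is the f.close() call inside getdata(): each worker process has its own copy of the handle, closes it after its first task, and every later write in that worker raises ValueError. apply_async() stores that exception instead of printing it, so nothing appears to go wrong. The sketch below is a minimal reproduction under stated assumptions: it uses Python 3 syntax rather than the Python 2 of the script above, and write_and_close, run_demo and out_path are hypothetical stand-ins with no network call.

```python
import os
import tempfile
from multiprocessing import Pool

# Hypothetical stand-in for the module-level handle `f` in the question.
out_path = os.path.join(tempfile.gettempdir(), "pool_demo.txt")
f = open(out_path, "w")


def write_and_close(line):
    """Mimics getdata(): write one line, then close the handle."""
    f.write(line)
    f.close()  # any later write in the same process raises ValueError


def run_demo():
    """Send three tasks to a single worker and collect the hidden errors."""
    errors = []
    with Pool(processes=1) as pool:
        pending = [pool.apply_async(write_and_close, ("row %d\n" % n,))
                   for n in range(3)]
        for n, result in enumerate(pending):
            try:
                # get() re-raises whatever exception the task hit in the worker.
                result.get(timeout=30)
            except ValueError as exc:
                errors.append((n, str(exc)))
    return errors


if __name__ == "__main__":
    for n, message in run_demo():
        print("task %d failed: %s" % (n, message))
```

Keeping the AsyncResult objects and calling get() on each one is a cheap way to check whether tasks are failing silently before changing the design.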

1 Answer:

Answer 0: (score: 0)

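So I found a solution. The workers finish at their own pace, so the output is not in any particular order, but it works for my purposes. I think you want to use a callback function to write to the file, which ensures that a task has finished by the time its result is written. I still don't think this is the proper solution, but it works.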

# Reuses the imports and globals (basepath, unixtime, f, res) from the question.
def getdata(i,j):
    url = basepath + str(j) + ',' + str(i) + ',' + str(unixtime)
    data = json.load(urllib2.urlopen(url))
    lat = data['latitude']
    lon = data['longitude']
    temp = data['currently']['temperature']
    humidity = data['currently']['humidity']
    pressure = data['currently']['pressure']
    clouds = data['currently']['cloudCover']
    winddir = data['currently']['windBearing']
    windspd = data['currently']['windSpeed']    
    dewpoint = data['currently']['dewPoint']
    precip = data['currently']['precipIntensity']
    return (('%f %f %f %f %f %f %f %f %f %f\n') % (lat, lon, temp, humidity, pressure, clouds, winddir, windspd, dewpoint, precip))

def writedata(data):
    # Callbacks run in the parent process, so only the parent writes to f.
    f.write(data)

pool = Pool(processes=40)
for i in range (-180, 180, res):
    for j in range (-90, 90, res):
        pool.apply_async(getdata, (i, j), callback=writedata)
pool.close()
pool.join()
f.close()
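
The callback version works because callbacks run in the parent process, so only one process ever touches the file and it is closed once, after pool.join(). If the order of the rows matters, one alternative is Pool.imap(), which yields results in submission order even when workers finish out of order. The sketch below is an illustration under assumptions, not the answerer's code: it uses Python 3 syntax, fetch_row is a hypothetical stand-in for getdata() that makes no network call, and grid, write_rows and the output filename are made up for the example.

```python
import os
import tempfile
from multiprocessing import Pool


def fetch_row(coords):
    """Hypothetical stand-in for getdata(): formats a row without any network call."""
    lon, lat = coords
    return "%f %f\n" % (lat, lon)


def grid(res):
    """Build the same (i, j) pairs as the nested loops in the question."""
    return [(i, j) for i in range(-180, 180, res) for j in range(-90, 90, res)]


def write_rows(rows, path):
    """Write every row from the parent process, then close the file once."""
    with open(path, "w") as out:
        for row in rows:
            out.write(row)


if __name__ == "__main__":
    out_path = os.path.join(tempfile.gettempdir(), "ordered_output.txt")
    with Pool(processes=4) as pool:
        # imap() hands back results in the order the inputs were submitted.
        write_rows(pool.imap(fetch_row, grid(90)), out_path)
```

Compared with apply_async() plus a callback, this keeps the rows in grid order and re-raises any exception from a worker in the parent while the results are being iterated, at the cost of waiting on slower tasks before writing later rows.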