How do I create a Python function that sums the values from 7 daily files into one weekly file?

Time: 2014-05-02 01:27:40

Tags: python file python-2.7 sum compare

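I have space-delimited files containing daily precipitation values for stations at different LAT/LON locations. Each daily file is formatted as follows: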

LAT LON PRCP

22.0 110.4 1.2

23.0 121.0 0.0

23.0 122.0 0.1

The first field is the latitude, the second is the longitude, and the third is the total daily precipitation.

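I want to create a weekly file, in the same format, that holds the sum of all the daily files for that week... but I'm running into problems. What makes this a bit tricky for me is that each daily file may not contain every location, so the number of lines can differ. I can't simply add the TOTAL PRCP field line by line into the weekly file, because the lines may not match up across days.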

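My current approach is to open each file, iterate over every line, and assign each field to a variable. I then compare those against the variables from the second daily file and, if the LAT and LON fields match, write a line with the sum of the two precipitation values. Then I compare against the next day, and so on for each day, writing to a "sum" file.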

with open(sundayFile, "r") as sundayFile:
    with open(mondayFile, "r") as mondayFile:
        with open(addMex1, "a") as addFile:

            print "\n\nNow checking Sunday File: " + str(sundayFile) + " and Monday File: " + str(mondayFile) + "\n\n"

            for lineA in sundayFile:
                parsedLineA = lineA.split()
                LAT_A = parsedLineA[0]
                LON_A = parsedLineA[1]
                TOTAL_PRCP_A = parsedLineA[2]

                print "Line in Sunday File: " + LAT_A + "," + LON_A + "," + TOTAL_PRCP_A + "\n"

                for lineB in mondayFile:
                    parsedLineB = lineB.split()
                    LAT_B = parsedLineB[0]
                    LON_B = parsedLineB[1]
                    TOTAL_PRCP_B = parsedLineB[2]

                    print "Line in Monday File: " + LAT_B + "," + LON_B + "," + TOTAL_PRCP_B + "\n"


                    if LAT_A == LAT_B and LON_A == LON_B:
                        print "\n***** Found a match for station at longitude of " + LON_A + " and latitude of " + LAT_A + "\n"
                        LAT = LAT_A
                        LON = LON_A
                        TOTAL_PRCP = str(float(TOTAL_PRCP_A) + float(TOTAL_PRCP_B))

                        addFile.write(LAT + "," + LON + "," + TOTAL_PRCP + "\n")


                    else:
                        addFile.write(LAT_A + "," + LON_A + "," + TOTAL_PRCP_A + "\n")
                        addFile.write(LAT_B + "," + LON_B + "," + TOTAL_PRCP_B + "\n")

This doesn't really work, and I've finally given up on hand-rolling it... there must be a Pythonic, elegant way to do this. Any help is greatly appreciated!

1 Answer:

Answer 0: (score: 1)

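It is simpler to use a defaultdict to hold the accumulated precipitation. The keys of this dictionary are (latitude, longitude) tuples. This does the trick: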

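from collections import defaultdict
import pprint

files = ['sunday.txt', 'monday.txt', 'tuesday.txt', 'wednesday.txt',
         'thursday.txt', 'friday.txt', 'saturday.txt']

# Missing keys start at 0.0, so a station absent on some days needs no special case.
totals = defaultdict(float)

for fn in files:
    with open(fn) as f:
        for line in f:
            if not line.strip():
                continue  # skip blank lines
            # "lon" rather than "long", which shadows a builtin in Python 2
            lat, lon, prec = line.split()  # strings
            totals[(lat, lon)] += float(prec)

# See what we have (convert to a plain dict so pprint shows just the mapping):
pprint.pprint(dict(totals))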

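Here is some sample data (sunday.txt is not listed; the totals below are consistent with it holding the three sample rows from the question):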

monday.txt
----------
22.0 110.4 3.2
23.0 121.0 1.0
23.0 122.0 0.2
24.0 122.0 1.0

tuesday.txt
-----------
22.0 110.4 1.0

wednesday.txt
-------------
23.0 122.0 0.3

thursday.txt
------------
24.0 122.0 1.0
25.0 1.0 1.0

friday.txt
----------
24.0 122.0 1.1

saturday.txt
------------
23.0 121.0 10.5

Here is the output for these files:

{('22.0', '110.4'): 5.4,
 ('23.0', '121.0'): 11.5,
 ('23.0', '122.0'): 0.6000000000000001,
 ('24.0', '122.0'): 3.1,
 ('25.0', '1.0'): 1.0}
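The 0.6000000000000001 in the output above is ordinary binary floating-point rounding, not a bug in the summation. If exact decimal totals matter, one option is to accumulate with `decimal.Decimal` instead of `float`. The following is only a sketch: the `readings` list is made-up illustration data standing in for lines already split into fields.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical, already-parsed rows: (lat, lon, precipitation) as strings.
readings = [
    ("23.0", "122.0", "0.1"),
    ("23.0", "122.0", "0.2"),
    ("23.0", "122.0", "0.3"),
]

# Decimal() with no arguments is Decimal('0'), so it works as a default factory.
totals = defaultdict(Decimal)
for lat, lon, prcp in readings:
    # Build the Decimal from the string, not from a float, to keep it exact.
    totals[(lat, lon)] += Decimal(prcp)

exact_total = totals[("23.0", "122.0")]
float_total = 0.1 + 0.2 + 0.3
```

Here `exact_total` is exactly `Decimal('0.6')`, while `float_total` carries the small rounding error seen in the output. If the values are only ever displayed with one decimal place, rounding at output time is usually enough and `float` is fine.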

I haven't taken the extra step of writing the aggregated data to a file in the same format; I'll leave that as an exercise ;)
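For readers who want to see that exercise worked through, here is one possible sketch, not the answerer's own code. The function names `sum_daily_files` and `write_weekly_file` are hypothetical, and the sketch assumes that a header row such as "LAT LON PRCP" and blank lines should be skipped, that stations should be written in numeric order, and that one decimal place is enough in the output.

```python
from collections import defaultdict


def sum_daily_files(daily_paths):
    """Return {(lat, lon): total precipitation} across all daily files."""
    totals = defaultdict(float)
    for path in daily_paths:
        with open(path) as f:
            for line in f:
                fields = line.split()
                if len(fields) != 3:
                    continue  # assumption: skip blank or malformed lines
                lat, lon, prcp = fields
                try:
                    value = float(prcp)
                except ValueError:
                    continue  # assumption: skip a header row like "LAT LON PRCP"
                # Convert before indexing, so a skipped row never creates a key.
                totals[(lat, lon)] += value
    return totals


def write_weekly_file(totals, weekly_path):
    """Write one space-delimited "LAT LON PRCP" line per station."""
    # Sort numerically rather than as strings, so 9.0 comes before 10.0.
    keys = sorted(totals, key=lambda k: (float(k[0]), float(k[1])))
    with open(weekly_path, "w") as out:
        for lat, lon in keys:
            # Assumption: one decimal place, matching the daily files.
            out.write("%s %s %.1f\n" % (lat, lon, totals[(lat, lon)]))
```

A call might look like `write_weekly_file(sum_daily_files(files), 'weekly.txt')`, where `weekly.txt` is a placeholder name. Note that the keys are the coordinate strings exactly as written in the files, so "22.0" and "22.00" would count as different stations; converting the keys to floats is an alternative if the files are not formatted consistently.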