Question

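I am reading a huge CSV file and extracting the date and time from its last line. I wrote the code below and am looking for an improved or optimized solution.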

Here is my data:

2067458,XXXXXXXXXX,1006386,100.79.94.1,XXXX4,1,0,0,1,0,1,"XXXXX",IMMEDIATE,"UNKNOWN",UNKNOWN,UNKNOWN,UNKNOWN,_ROUTER_HAS_NO_RADIO_,UNKNOWN,UNKNOWN,UNKNOWN,UNKNOWN,2017-01-24,16:03:43 ,,,,,,, ,,,,

Here is my code:

import csv
import datetime
import re

input_file = 'input22.csv'
output_file= 'temp.csv'

def main():
    with open(input_file,"r") as fileHandle:
         CSVreader = fileHandle.readlines()
         fileHandle.close()
         reader  = CSVreader[-1]

    with open ('temp.csv',"w") as fileHandle:
         fileHandle.write(reader)
         fileHandle.close()

    with open('temp.csv') as temp_file:
         readCSV = csv.reader(temp_file, delimiter=',')
         for row in readCSV:
             Date=(row[22])
             Time=(row[23])
             D=Date.strip()
             T=Time.strip()
             print ("{} {}".format(D,T))


main()

Answer 1

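I see a few problems with your code right away. When you open a file with a with block, there is no need to close it yourself. The whole point of using a context manager is that the file is closed as soon as you leave the block.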

For example,

with open ('temp.csv',"w") as fileHandle:
     fileHandle.write(reader)
     fileHandle.close()

should be:

with open ('temp.csv',"w") as fileHandle:
     fileHandle.write(reader)

That's it! Python takes care of closing the file for you.
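A small sketch that makes this visible, assuming a throwaway file in a temporary directory (the path and variable names are made up for illustration). The closed attribute of the file object is False inside the block and True once the block has exited:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as fh:
    fh.write("hello\n")
    closed_inside = fh.closed  # the file is still open inside the block

closed_after = fh.closed  # the with block has closed it on exit

print(closed_inside, closed_after)
```

The same happens if the block is left because of an exception, which is the main reason to prefer with over a manual close() call.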

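Next, don't use fh.readlines(). It reads the entire file into memory, and if the file is too large to fit in memory it may crash your computer. Instead, iterate over the file, as per the documentation. In this case, that looks like: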

with open(input_file, "r") as fileHandle:
     CSVreader = csv.reader(fileHandle)
     for row in CSVreader:
         # do something with the row
         pass

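Python buffers your reads automatically and keeps only a small part of the file in memory at any one time. Side note: the way you currently have it, you read the entire file into CSVreader, which should really be called rows or something similar, because it is not a reader object.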
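If only the last row is needed, one possible way to express that iteration is a collections.deque with maxlen=1, which consumes the reader row by row and keeps just the most recent one. This is a sketch of an alternative, not something from the original code; the helper name last_row is made up, and an in-memory io.StringIO stands in for the real file:

```python
import csv
import io
from collections import deque


def last_row(file_obj):
    """Return the last CSV row of file_obj, or None if it has no rows."""
    reader = csv.reader(file_obj)
    # A deque with maxlen=1 drops older rows as new ones arrive,
    # so only one row is held in memory at a time.
    tail = deque(reader, maxlen=1)
    return tail[0] if tail else None


# Stand-in for a real file opened with open(input_file, newline='').
sample = io.StringIO("a,b,c\n1,2,3\n4,5,6\n")
row = last_row(sample)
print(row)
```

This still reads the whole file once from start to end; it only avoids holding all of it in memory.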

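Finally, there is no need to pass the mode to the open() call in the first instance, since "r" is the default. You can use the same syntax you used when opening temp.csv the second time.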

I believe this (untested) snippet does the same thing as yours, more concisely and efficiently.

import csv

input_file = 'input22.csv'
output_file = 'temp.csv'

def main():
    last = []  # defined here so that we can use it after the files have been closed
    # newline='' is what the csv module documentation recommends for files it reads or writes
    with open(input_file, newline='') as input_fh, \
            open(output_file, 'w', newline='') as output_fh:
        reader = csv.reader(input_fh)
        writer = csv.writer(output_fh)
        # discard everything except the last row of the input
        for row in reader:
            last = row
        writer.writerow(last)  # csv.writer objects have writerow(), not write()

    # print the date and time from that last row
    D = last[22].strip()
    T = last[23].strip()
    print("{} {}".format(D, T))

main()
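For a really huge file, even the streaming loop above still reads every byte. A different technique, sketched here under assumptions, is to read backward from the end of the file in binary mode until one complete line has been found. This assumes that the last row contains no newline inside a quoted field and that the file is UTF-8 encoded; the helper names read_last_line and last_date_time and the chunk size are hypothetical choices, not part of the original code.

```python
import csv
import os


def read_last_line(path, chunk_size=4096, encoding="utf-8"):
    """Return the last non-empty line of a file by reading backward from its end."""
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        position = fh.tell()
        data = b""
        while position > 0:
            step = min(chunk_size, position)
            position -= step
            fh.seek(position)
            data = fh.read(step) + data
            # Stop as soon as a newline precedes the last non-empty line.
            if b"\n" in data.rstrip(b"\r\n"):
                break
        # Drop trailing line endings, then keep what follows the last newline.
        last = data.rstrip(b"\r\n").rsplit(b"\n", 1)[-1]
        return last.decode(encoding)


def last_date_time(path):
    """Parse the last line as one CSV row and return 'date time' from columns 22 and 23."""
    row = next(csv.reader([read_last_line(path)]))
    return "{} {}".format(row[22].strip(), row[23].strip())
```

Calling last_date_time('input22.csv') would then touch only the final few kilobytes of the file, with no temporary file needed.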
