Reading the last valid row of a CSV with Python

Time: 2016-07-09 06:17:36

Tags: python csv tail

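I am reading from my CSV file with Python, but I only want to read one specific row from the tail of the file: the last valid row for a given address. There should also be a catch function that returns the whole row only when it is valid. Can anyone help me with this?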

My CSV file looks like this:

Sr.       Add             A       B       C         D
0   0013A20040D6A141    -308.1  -307.6  -307.7  -154.063    
1   0013A20040DC889A    -308.7  -311.7  -311.7  -154.263    
2   0013A20040DC88C3    -310.1  -310.1  -310.2  -154.863    
3   0013A20040D6A141    -308.2  -306.8  -307.7  -153.863    
4   0013A20040DC889A    -308.7  -311.4  -311.1  -153.263    
5   0013A20040DC88C3      --      --      --       --   
6   0013A20040D6A141    -308.7  -308.3  -305.2  -154.663    

The code I am trying is:

import struct
from collections import deque

def last_data(address):
    i = sum(1 for line in open("filename.csv", 'r'))
    print i # number of lines in csv
    cache = {} # dict that saved the last data for particular address
    n = 3

    with open("filename.csv",'r') as f:
        q = deque(f, 3)  # 3 lines read at the end
        qp = [''] * n
        if i +1  >=  n:  # for checking whether the number of lines greater than number of add.
            for k in range(n):

                qp[k] = q[k].split(',')

                if address == str(qp[k][1]): # check for particular address in row
                  # if the row has data than put it into json object with address as key and nested key as columns 'A', 'B', etc.      
                    cache.update({address: {'A':struct.pack('>l',int(float(qp[k][3]) * 10)),
                                            'C':struct.pack('>l',int(float(qp[k][4]) * 10))
                                            }})

                    return cache[address]['A'], cache[address]['C']

For last_data('0013A20040DC88C3') this returns row 5, which contains invalid data; I want it to return row 2 instead. Can anyone tell me how to do that?

1 Answer:

Answer 0: (score: 1)

With pandas, it would look like this:

Note: this is Python 2.7 code. On Python 3, change the StringIO import to `from io import StringIO`.

import pandas as pd
from StringIO import StringIO

input = """Sr.       Add             A       B       C         D
0   0013A20040D6A141    -308.1  -307.6  -307.7  -154.063    
1   0013A20040DC889A    -308.7  -311.7  -311.7  -154.263    
2   0013A20040DC88C3    -310.1  -310.1  -310.2  -154.863    
3   0013A20040D6A141    -308.2  -306.8  -307.7  -153.863    
4   0013A20040DC889A    -308.7  -311.4  -311.1  -153.263    
5   0013A20040DC88C3      --      --      --       --   
6   0013A20040D6A141    -308.7  -308.3  -305.2  -154.663 
"""

buffer = StringIO(input)

# pandas is imported as pd above; sep=r"\s+" splits on runs of whitespace
df = pd.read_csv(buffer, sep=r"\s+", na_values=["--"])

# you can customize the behaviour here, e.g. how many invalid values are ok per row.
# see http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.dropna.html

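df = df.dropna()

# of the rows that remain, take the last one for the address of interest
last_valid = df[df["Add"] == "0013A20040DC88C3"].iloc[-1]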
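
If pandas is not available, the same idea (skip rows that fail to parse, keep the last good one for the requested address) can be sketched with the standard library alone. This is only a minimal Python 3 sketch: the helper names `parse_row` and `last_valid_row` are made up for illustration, and it assumes the file is whitespace-delimited exactly as shown in the question. If the real file is comma-separated, as the `split(',')` in the question suggests, the splitting step would need to change accordingly.

```python
import io

# Sample data copied from the question (assumed whitespace-delimited).
SAMPLE = """\
Sr.       Add             A       B       C         D
0   0013A20040D6A141    -308.1  -307.6  -307.7  -154.063
1   0013A20040DC889A    -308.7  -311.7  -311.7  -154.263
2   0013A20040DC88C3    -310.1  -310.1  -310.2  -154.863
3   0013A20040D6A141    -308.2  -306.8  -307.7  -153.863
4   0013A20040DC889A    -308.7  -311.4  -311.1  -153.263
5   0013A20040DC88C3      --      --      --       --
6   0013A20040D6A141    -308.7  -308.3  -305.2  -154.663
"""


def parse_row(line):
    """Return (address, [A, B, C, D]) or None if the row is not valid."""
    fields = line.split()
    if len(fields) != 6:
        return None
    try:
        values = [float(x) for x in fields[2:]]
    except ValueError:
        # Header text or "--" placeholders end up here.
        return None
    return fields[1], values


def last_valid_row(lines, address):
    """Scan all lines and remember the last valid row for `address`."""
    last = None
    for line in lines:
        parsed = parse_row(line)
        if parsed is not None and parsed[0] == address:
            last = parsed[1]
    return last


# With a real file, pass the open file object instead of io.StringIO(SAMPLE).
print(last_valid_row(io.StringIO(SAMPLE), "0013A20040DC88C3"))
# prints [-310.1, -310.1, -310.2, -154.863]
```

Scanning the whole file forward keeps the sketch simple. For very large files, reading only the tail (as the `deque` in the question attempts) would be faster, but the window then has to be large enough to reach back past any invalid rows.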