Filtering specific rows in a csv, keeping track of the last value

Time: 2017-06-07 15:22:58

Tags: python csv

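I have a .csv file that mixes several data types, and I am trying to work with the values in a single column.

My csv file looks like this:

"int","float","string", more data

Example:

"2","1.378","Johnny"
"1","1.379","Walker"
"5","1.380","Jack"
"8","1.700","Daniels"
"8","1.710","Baileys"
"8","1.381","Monkey"
"8","1.711","Shoulder"
"8","1.383","Captain"
"8","1.385","Morgan"
"8","1.392","Drinks"
More rows

I want to subtract consecutive values in the second column and check whether their difference is greater than x. (Only those values; I don't care about the others.)

My code so far:

import csv

with open('input.csv', 'r') as file, open('output.csv', 'w') as f_out:
    readCSV = csv.reader(file)
    writeCSV = csv.writer(f_out, lineterminator='\n')
    last = None

    for row in readCSV:
        datalat = float(row[1])

        if last is not None:
            #print("difference -> %f" %(datalat-last))
            outp = (datalat-last)
            if outp <= 0.02:
                writeCSV.writerow(row)
        last = datalat  # updated on every row, even discarded ones

This does not produce the output I want.

What it should do is write only the rows whose difference is less than 0.02. If a row has a larger difference, it should be discarded, and the next row should then be compared with the last written row, not the last discarded one.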

1 answer:

Answer 0: (score: 1)

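Two things:

  1. You should take the absolute value of the difference (using abs), because you don't know a priori which value is larger.
  2. Update last only when the condition is met, so that last is never a discarded value.

    # replaces the loop inside the with block of the question's code
    last = float(next(readCSV)[1])  # assign first reference value
    file.seek(0)                    # rewind the input so the first row is processed too
    for row in readCSV:
        datalat = float(row[1])
        diff = abs(datalat-last)
        if diff <= 0.02:
            writeCSV.writerow(row)
            last = datalat          # only a written row becomes the new reference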
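
To see both points together, here is a minimal self-contained sketch that runs the same logic on the sample rows from the question, held in memory instead of read from input.csv. The names SAMPLE and filter_by_last_kept and the parameters column and tolerance are made up for this illustration. It also assumes the first row is always kept and serves as the initial reference, which is one reading of the question.

```python
import csv
import io

# Sample rows copied from the question.
SAMPLE = '''"2","1.378","Johnny"
"1","1.379","Walker"
"5","1.380","Jack"
"8","1.700","Daniels"
"8","1.710","Baileys"
"8","1.381","Monkey"
"8","1.711","Shoulder"
"8","1.383","Captain"
"8","1.385","Morgan"
"8","1.392","Drinks"
'''


def filter_by_last_kept(rows, column=1, tolerance=0.02):
    """Keep rows whose value is within tolerance of the last KEPT row.

    Hypothetical helper: the first row is assumed to be kept and used
    as the initial reference value.
    """
    kept = []
    last = None
    for row in rows:
        value = float(row[column])
        # abs() because we do not know which of the two values is larger
        if last is None or abs(value - last) <= tolerance:
            kept.append(row)
            last = value  # only a kept row becomes the new reference
    return kept


rows = list(csv.reader(io.StringIO(SAMPLE)))
kept = filter_by_last_kept(rows)
print([row[2] for row in kept])
```

On the sample data this keeps Johnny, Walker, Jack, Monkey, Captain, Morgan and Drinks. Daniels, Baileys and Shoulder are dropped, and because last is not updated for them, Monkey is compared with Jack (1.380) rather than with Baileys (1.710).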