Error when editing many CSV files

Time: 2013-08-22 16:15:14

Tags: python csv counter glob

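I asked a question a few days ago because I needed help editing some CSV files: Fix numbering on CSV files that have deleted lines. The people on Stack Overflow helped me a lot, but I keep getting an error saying AttributeError: 'int' object has no attribute 'strip'. The thing is, none of the data in my CSV files is an integer. I am a Python newbie, and a few days of trying to fix it have only made things worse. Here is the code from my earlier question that produces the error: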

import csv
import glob
import os
import re

numbered = re.compile(r'N\d+').match

for fn in fns:
    # open for counting
    reader = csv.reader(open(fn, "rb"))
    count = sum(1 for row in reader if row and not any(r.strip() == 'DIF' for r in row) and numbered(row[0]))

    # reopen for filtering
    reader = csv.reader(open(fn, "rb"))

    with open(os.path.join('out', fn), 'wb') as f:
        counter = 0
        w = csv.writer(f)
        for row in reader:
            if row and 'Count' in row[0].strip():
                row = ['Count', count]
            if row and not any(r.strip() == 'DIF' for r in row):  # remove DIF
                if numbered(row[0]):
                    counter += 1
                    row[0] = 'N%d' % counter
            w.writerow(row)

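The code is basically supposed to run through a bunch of CSV files, delete every row that contains "DIF", and fix the numbering that is thrown off by the deleted rows. Does anyone have any suggestions?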

1 Answer:

Answer 0: (score: 0)
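
The error does not appear to come from the CSV data itself. In the question's loop, the 'Count' row is replaced with ['Count', count], where count is an integer, and the DIF check then calls .strip() on every field of that new row. A minimal sketch of that situation, using made-up values rather than the real files:

```python
# Hypothetical values for illustration; not taken from the original CSV files.
count = 3
row = ['Count', count]      # the second field is now an int, not a str

try:
    any(r.strip() == 'DIF' for r in row)
    error_message = None
except AttributeError as exc:
    # int has no .strip() method, so the generator fails on the second field
    error_message = str(exc)

# Converting each field to a string first avoids the error.
has_dif = any(str(r).strip() == 'DIF' for r in row)
```

Every field read by csv.reader is a string, so the integer can only come from a row built inside the script.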

The simplest fix is probably to wrap r in str(). But while you are at it, why not read each file only once, which makes things easier:

import csv
import glob
import os
import re

numbered = re.compile(r'N\d+').match

# fns is the list of CSV file names to process; the post does not show how it is built
for fn in fns:
    reader = csv.reader(open(fn, "rb"))

    # filter out 'DIF' rows here
    rows = [row for row in reader
            if not any(str(r).strip() == 'DIF' for r in row)]

    # count numbered rows
    count = len([row for row in rows if row and numbered(row[0])])

    with open(os.path.join('out', fn), 'wb') as f:
        counter = 0
        w = csv.writer(f)

        for row in rows:
            # replace the 'Count' row with the recomputed total
            if row and 'Count' in row[0].strip():
                row = ['Count', count]

            # renumber the remaining 'N<number>' rows consecutively
            if row and numbered(row[0]):
                counter += 1
                row[0] = 'N%d' % counter

            w.writerow(row)
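
The "rb" and "wb" file modes above are the Python 2 convention for the csv module. For Python 3, where csv expects text-mode files opened with newline='', the same single-pass idea might be sketched as follows. The function names renumber, process_file and process_all, the '*.csv' pattern, and the 'out' default are assumptions made for illustration; none of them come from the original post.

```python
import csv
import glob
import os
import re

numbered = re.compile(r'N\d+').match


def renumber(rows):
    """Drop rows with a 'DIF' field, then fix the 'Count' row and the N-numbering."""
    kept = [row for row in rows
            if not any(str(field).strip() == 'DIF' for field in row)]
    # number of surviving rows whose first field looks like 'N<number>'
    count = sum(1 for row in kept if row and numbered(row[0]))

    result = []
    counter = 0
    for row in kept:
        if row and 'Count' in row[0]:
            row = ['Count', count]
        elif row and numbered(row[0]):
            counter += 1
            row = ['N%d' % counter] + row[1:]
        result.append(row)
    return result


def process_file(src, dst):
    """Read src once and write the cleaned rows to dst."""
    with open(src, newline='') as f:
        rows = list(csv.reader(f))
    with open(dst, 'w', newline='') as f:
        csv.writer(f).writerows(renumber(rows))


def process_all(pattern='*.csv', out_dir='out'):
    """Hypothetical driver: the pattern and output directory are assumptions."""
    os.makedirs(out_dir, exist_ok=True)
    for fn in glob.glob(pattern):
        process_file(fn, os.path.join(out_dir, os.path.basename(fn)))
```

Keeping the row logic in renumber, separate from the file handling, makes it possible to check the filtering and renumbering on a small list of rows before running it over real files with process_all().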