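I am trying to find every occurrence of the word NIL in several huge CSV files and replace it with an empty string. I have looked for solutions, but the one I tried does not work because each row is a list, and the others I found seem to target a specific position. I don't know where NIL will appear, because the files are always changing.
My code: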
import Tkinter, tkFileDialog, os, csv
root = Tkinter.Tk()
root.withdraw()
dirname = tkFileDialog.askdirectory(parent=root,initialdir="/",title='Please select a directory')
for subdir, dirs, files in os.walk(dirname):
    for file in files:
        with open (os.path.join(subdir, file), 'rb') as csvfile:
            #Check if the file has headers#
            if 'Result : Query Result' in csvfile.readline():
                with open(os.path.join(subdir, os.path.splitext(file)[0] + '_no_headers_no_nil.csv'), 'wb') as out:
                    reader = csv.reader(csvfile.readlines()[6:], delimiter=',')
                    writer = csv.writer(out)
                    for row in reader:
                        #replace NIL occurrences with empty strings
                        row = row.replace('NIL', '')
                        separated = row.split(',')
                        writer.writerow(row)
            else:
                #The file doesn't have headers
                #find and replace NIL occurrences goes here
                print 'file skipped ' + file + ': No headers found'
Here is a sample of the kind of CSV:
Answer 0 (score: 2)
If NIL is not in every row, use try/except: get its index and assign an empty string to that position:
try:
    row[row.index("NIL")] = ""
except ValueError:
    # list.index raises ValueError (not IndexError) when "NIL" is absent
    pass
index finds the position of NIL in your list; once you have it, the assignment replaces that element:
In [9]: lst = ["NIL", "foo"]
In [10]: lst[lst.index("NIL")] = ""
In [11]: lst
Out[11]: ['', 'foo']
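To see how this behaves on rows with no NIL or with more than one, here is a minimal sketch. The helper name `blank_first_nil` and the sample rows are made up for illustration. It relies on `list.index` raising `ValueError` when the value is missing, and it shows that only the first match is replaced.

```python
def blank_first_nil(row):
    """Blank out the first "NIL" in row, in place; leave row alone if there is none."""
    try:
        row[row.index("NIL")] = ""
    except ValueError:
        # list.index raises ValueError when "NIL" is not in the list
        pass
    return row


# Hypothetical sample rows, not taken from the real files
no_nil = blank_first_nil(["foo", "bar"])
one_nil = blank_first_nil(["NIL", "foo"])
two_nil = blank_first_nil(["NIL", "foo", "NIL"])
print(no_nil, one_nil, two_nil)
```

The third row still contains a NIL after the call, so this approach only fits rows with at most one NIL.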
Since each row can contain more than one NIL string, you need to go through every element:
row[:] = [ele if ele != "NIL" else "" for ele in row]
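The same comprehension can be wrapped in a small function and applied to each row. This is only a sketch: `blank_nil` and the sample rows are hypothetical names and data. Unlike the in-place `row[:] = ...` form, it returns a new list. The comparison is an exact match on the whole field, so a field such as "NILS" is left untouched.

```python
def blank_nil(row):
    """Return a copy of row with every field equal to "NIL" replaced by ""."""
    return ["" if field == "NIL" else field for field in row]


# Hypothetical sample rows, not taken from the real files
rows = [["NIL", "foo", "NIL"], ["a", "NILS", "b"], []]
cleaned = [blank_nil(row) for row in rows]
print(cleaned)
```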
Also, you don't need to call readlines; you can use itertools.islice to start from the nth line:
from itertools import islice
reader = csv.reader(islice(csvfile, 6, None), delimiter=',')
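Putting the two pieces together, a rough end-to-end sketch might look like the following. It is written for Python 3 and uses in-memory `io.StringIO` objects in place of real files so it can be run as is. The function name `strip_header_and_nil`, the `skip` parameter, and the sample text are assumptions for illustration, not part of the original script.

```python
import csv
import io
from itertools import islice


def strip_header_and_nil(src, dst, skip=6):
    """Copy CSV rows from src to dst, skipping `skip` lines and blanking NIL fields."""
    # islice reads lazily, so the whole file is never loaded into memory
    reader = csv.reader(islice(src, skip, None), delimiter=',')
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(["" if field == "NIL" else field for field in row])


# Made-up sample: six header lines followed by CSV data
sample = "h1\nh2\nh3\nh4\nh5\nh6\nid,name,value\n1,NIL,3\nNIL,foo,NIL\n"
src = io.StringIO(sample)
dst = io.StringIO()
strip_header_and_nil(src, dst)
result = dst.getvalue()
print(result)
```

With real files in Python 3 you would open both in text mode with `newline=''`, as the csv module documentation recommends; the Python 2 code in the question uses `'rb'` and `'wb'` instead.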