I am trying to remove some substrings from the strings in a csv file.
import csv
import string

input_file = open('in.csv', 'r')
output_file = open('out.csv', 'w')
data = csv.reader(input_file)
writer = csv.writer(output_file, quoting=csv.QUOTE_ALL)  # dialect='excel')
specials = ("i'm", "hello", "bye")

for line in data:
    line = str(line)
    new_line = str.replace(line, specials, '')
    writer.writerow(new_line.split(','))

input_file.close()
output_file.close()
So for this example:
hello. I'm obviously over the moon. If I am being honest I didn't think I'd get picked, so to get picked is obviously a big thing. bye.
I would like the output to be:
obviously over the moon. If I am being honest I didn't think I'd get picked, so to get picked is obviously a big thing.
But this only works when I search for a single word, for example specials = "i'm". Do I need to put my words in a list or an array?
Answer 0: (score: 0)
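You appear to have already split the input with csv.reader, but by converting the split row back into a string you throw away all of that goodness. It is better not to do that and to keep working with the lists you get from the csv reader instead. So it becomes something like this: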
for row in data:
    new_row = []  # A place to hold the processed row data.
    # Look at each field in the row.
    for field in row:
        # Remove all the special words.
        new_field = field
        for s in specials:
            new_field = new_field.replace(s, '')
        # Add the sanitized field to the new "processed" row.
        new_row.append(new_field)
    # After all fields are processed, write the row with the csv writer.
    writer.writerow(new_row)
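One thing the loop above does not cover: str.replace is case-sensitive, so "i'm" in specials will not match the "I'm" in the sample sentence. Below is a minimal, self-contained sketch of one way to handle that with a case-insensitive regular expression. The helper names strip_specials and clean_csv and the in-memory sample row are made up for illustration, not part of the original code.

```python
import csv
import io
import re

specials = ("i'm", "hello", "bye")

# One case-insensitive pattern that matches any of the special words.
# re.escape guards against words that contain regex metacharacters.
pattern = re.compile("|".join(re.escape(s) for s in specials), re.IGNORECASE)


def strip_specials(field):
    """Return field with every special word removed, ignoring case."""
    return pattern.sub("", field)


def clean_csv(reader, writer):
    """Copy rows from reader to writer, cleaning each field on the way."""
    for row in reader:
        writer.writerow([strip_specials(field) for field in row])


# In-memory demo (hypothetical data). With real files, open both with
# newline='' as the csv module documentation recommends.
source = io.StringIO('"hello. I\'m over the moon, honestly. bye."\r\n')
target = io.StringIO()
clean_csv(csv.reader(source), csv.writer(target, quoting=csv.QUOTE_ALL))
print(target.getvalue())
```

Note that this only deletes the words themselves. The punctuation and spaces that surrounded them stay behind, so getting exactly the output shown in the question would need an extra clean-up step.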
Answer 1: (score: 0)
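It looks like you are not iterating over specials. str.replace takes a single substring, not a collection of them (a tuple or a list makes no difference here), so each word needs its own replace call. Try this: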
specials = ["i'm", "hello", "bye"]
for line in data:
    new_line = str(line)
    for word in specials:
        new_line = str.replace(new_line, word, '')
    writer.writerow(new_line.split(','))
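One caveat about the str(line) step in the snippet above, shown as a small sketch: csv.reader yields each row as a list, and calling str() on a list returns its repr, brackets and quotes included. Splitting that text on commas then cuts through any field that itself contains a comma. The row value below is invented purely for illustration.

```python
# A row as csv.reader would yield it: two fields, the first containing a comma.
row = ["hello, world", "bye"]

# str() on a list gives its repr, not the original CSV text.
as_text = str(row)
print(as_text)

# Splitting on ',' cuts inside the first field and keeps the brackets and quotes.
pieces = as_text.split(",")
print(len(pieces))
print(pieces)
```

Here a two-field row comes back as three pieces, which is why working on the fields of the list directly tends to be safer.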