Question

I am trying to remove some substrings from the strings in a csv file.

    import csv
    import string

    input_file = open('in.csv', 'r')
    output_file = open('out.csv', 'w')
    data = csv.reader(input_file)
    writer = csv.writer(output_file, quoting=csv.QUOTE_ALL)  # dialect='excel')
    specials = ("i'm", "hello", "bye")

    for line in data:
        line = str(line)
        new_line = str.replace(line, specials, '')
        writer.writerow(new_line.split(','))

    input_file.close()
    output_file.close()

So for this example:

 hello. I'm obviously over the moon. If I am being honest I didn't think I'd get picked, so to get picked is obviously a big thing.  bye.

I want the output to be:

obviously over the moon. If I am being honest I didn't think I'd get picked, so to get picked is obviously a big thing.

But this only works when I search for a single word, for example specials = "i'm". Do I need to put my words in a list or an array?

Answer 1

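You already have the input split into fields by csv.reader, but by converting each row back into a string you throw all of that goodness away. It is better not to do that and to keep working with the lists you get from the csv reader. So it becomes something like this: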

for row in data:
    new_row = []  # A place to hold the processed row data.

    # look at each field in the row.
    for field in row:

        # remove all the special words.
        new_field = field
        for s in specials:
            new_field = new_field.replace(s, '')

        # add the sanitized field to the new "processed" row.
        new_row.append(new_field)

    # after all fields are processed, write it with the csv writer.
    writer.writerow(new_row)
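
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It assumes in-memory io.StringIO objects in place of in.csv and out.csv, and the helper name strip_specials is made up for illustration; neither comes from the question.

```python
import csv
import io


def strip_specials(row, specials):
    """Return a copy of row with every word in specials removed from each field."""
    new_row = []
    for field in row:
        for s in specials:
            field = field.replace(s, '')
        new_row.append(field)
    return new_row


specials = ("i'm", "hello", "bye")

# Stand-ins for in.csv / out.csv (assumption: real code would open the files instead).
input_file = io.StringIO('"hello there","i\'m fine, thanks","bye"\n')
output_file = io.StringIO()

writer = csv.writer(output_file, quoting=csv.QUOTE_ALL)
for row in csv.reader(input_file):
    writer.writerow(strip_specials(row, specials))

result = output_file.getvalue()
```

Note that the comma inside the second field survives, because the row is never turned back into a string and re-split on commas.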

Answer 2

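It looks like you are not iterating over specials. str.replace takes one substring per call, so passing the whole collection (tuple or list) will not remove each word; you have to loop over the words yourself. Try this: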

specials = ["i'm", "hello", "bye"]

for line in data:
    new_line = str(line)
    # remove the words one at a time.
    for word in specials:
        new_line = new_line.replace(word, '')
    writer.writerow(new_line.split(','))
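
If the word list grows, or case should not matter (the sample text has "I'm" while specials has "i'm"), one alternative is a single compiled regular expression. This is a different technique from the loop above, not something proposed in the thread; the name remove_specials is hypothetical, and ignoring case is only an assumption about what the asker wants.

```python
import re

specials = ["i'm", "hello", "bye"]

# One alternation pattern; re.escape keeps any regex metacharacters literal.
pattern = re.compile("|".join(re.escape(s) for s in specials), re.IGNORECASE)


def remove_specials(text):
    """Remove every occurrence of the special words from text, ignoring case."""
    return pattern.sub("", text)


cleaned = remove_specials("hello. I'm obviously over the moon. bye.")
```

As with str.replace, only the words themselves are removed, so the surrounding punctuation and spaces stay behind and may need a separate cleanup step.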

Python: removing substrings from a string
