Editing a CSV column with Python

Time: 2017-11-09 14:24:28

Tags: python regex python-2.7 csv

I want to convert everything in column 2 of a CSV file to lowercase, remove all punctuation, and save the file. How can I do this?

import re
import csv

with open('Cold.csv', 'rb') as f_input1:
    with open('outing.csv', 'wb') as f_output:

      reader = csv.reader(f_input1)
      writer = csv.writer(f_output)

for row in reader:
      row[1] = re.sub('[^a-z0-9]+', ' ', str(row[1].lower()))
      writer.writerow(row)
f_input1.close()

How do I add:

re.sub('[^A-Za-z0-9]+', ' ', str(row))
filewriter.writerow([new_row.lower()]) 

or .lower to this code?

3 answers:

Answer 0 (score: 2)

You can add code to modify the cell like this:

import re
import csv

with open('in.csv', 'rb') as f_input, open('out.csv', 'wb') as f_output:
    csv_output = csv.writer(f_output)

    for row in csv.reader(f_input):
        row[1] = re.sub('[^A-Za-z0-9]+', '', row[1].lower())
        csv_output.writerow(row)

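.lower() is applied first to convert the string to lowercase. Using with ensures that both files are closed automatically when the block ends, so no explicit close() call is needed.

Note that your regex sub should replace any invalid characters with an empty string, i.e. '', whereas you currently have it set to a single space.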
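
To make the difference between the two replacement strings concrete, here is a small sketch. The helper name `clean` and the sample string are hypothetical and only for illustration; which result you want depends on whether words should stay separated.

```python
import re

def clean(text, replacement):
    """Lowercase text, then replace each run of non-alphanumeric characters."""
    return re.sub('[^a-z0-9]+', replacement, text.lower())

sample = "Hello, World! It's 2017."  # hypothetical cell value

joined = clean(sample, '')   # punctuation AND spaces removed, words run together
spaced = clean(sample, ' ')  # each run becomes one space, words stay separated

print(repr(joined))  # 'helloworldits2017'
print(repr(spaced))  # 'hello world it s 2017 '
```

With a space as the replacement, a trailing space can remain where the text ended in punctuation; calling .strip() on the result would remove it.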

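Answer 1 (score: 1)

Just edit the appropriate row and write it back. The loop has to sit inside the with block so the files are still open while the rows are processed: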

import re
import csv

with open('Cold.csv', 'rb') as f_input1, open('outing.csv', 'wb') as f_output:

    reader = csv.reader(f_input1)
    writer = csv.writer(f_output)

    for row in reader:  
        row[1] = re.sub('[^a-z0-9]+', ' ', str(row[1].lower()))
        writer.writerow(row)
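
The code in this thread targets Python 2.7, where the csv module expects files opened in binary mode. On Python 3 the files are opened in text mode with newline='' instead. The following is a rough sketch of that variant, not a drop-in replacement: the function name `clean_second_column` is hypothetical, and the guard for rows with fewer than two columns is an added assumption.

```python
import csv
import re

def clean_second_column(in_path, out_path):
    """Copy a CSV file, lowercasing column 2 and replacing punctuation runs with a space."""
    # Python 3: text mode with newline='' so the csv module handles line endings itself
    with open(in_path, newline='') as f_input, \
         open(out_path, 'w', newline='') as f_output:
        writer = csv.writer(f_output)
        for row in csv.reader(f_input):
            if len(row) > 1:  # assumption: leave short or empty rows untouched
                row[1] = re.sub('[^a-z0-9]+', ' ', row[1].lower())
            writer.writerow(row)
```

It could be called as clean_second_column('Cold.csv', 'outing.csv'), using the file names from the question.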

Answer 2 (score: -1)

The simplest solution is to combine these two approaches.

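import string

# s is the cell text, e.g. s = row[1]
# remove punctuation (Python 2 str.translate signature)
s = s.translate(None, string.punctuation)

# alternative, if speed is not an issue
exclude = set(string.punctuation)
s = ''.join(ch for ch in s if ch not in exclude)

# convert to lower case (call it on the string, not on the row list)
s = s.lower()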
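
The translate(None, ...) form is specific to Python 2 str objects. On Python 3 the same idea is usually written with a translation table built by str.maketrans. A minimal sketch, assuming Python 3; the names `PUNCT_TABLE` and `strip_punctuation` are hypothetical:

```python
import string

# Table that maps every ASCII punctuation character to None, i.e. deletes it
PUNCT_TABLE = str.maketrans('', '', string.punctuation)

def strip_punctuation(text):
    """Delete ASCII punctuation from text and convert it to lowercase."""
    return text.translate(PUNCT_TABLE).lower()

print(strip_punctuation("Hello, World! It's 2017."))  # hello world its 2017
```

Unlike the regex with a space replacement, this deletes punctuation without inserting anything, so "It's" becomes "its" and existing spaces are kept as they are.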