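I'm trying to remove the unwanted characters from each line (a trailing "/", and "#" plus whatever follows it) and to change http: to https:. My code is as follows: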
import csv
with open('C:\\project\\in.csv','r') as input_file:
    with open('C:\\project\\out.csv','w') as output_file:
        for L in input_file:
            if L.endswith("/"):
                newL=L.replace("/","")
                output_file.write(newL)
            elif L.find("#"):
                newL,sep,tail=L.partition("#")
                output_file.write(newL)
            elif L.startswith('http:'):
                newL=L.replace('http:','https:')
                output_file.write(newL)
Here is a mini sample of the in.csv file I use for testing:
line1/
line2#sdgsgs
https://line3
http://line4
line5/
After cleaning, I would like it to look like this:
line1
line2
https://line3
https://line4
line5
But the result is not what I want. Could someone give me a hand?
Many thanks, Henry
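Answer 0 (score: 1)
In this version, a single line may contain any combination of the characters to be replaced, because each rule is checked with its own if instead of an if/elif chain: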
#!/usr/bin/env python
import csv

Output = []
with open('C:\\project\\in.csv', 'r') as input_file:
    for line in input_file:
        # Remove the trailing newline so endswith("/") can match.
        line = line.strip()
        if line.endswith("/"):
            # Note: this removes every "/" in the line, not only the last one.
            line = line.replace("/", "")
        if "#" in line:
            # Keep only the text before the first "#".
            line, sep, tail = line.partition("#")
        if line.startswith('http:'):
            line = line.replace('http:', 'https:')
        Output.append(line)

with open('C:\\project\\out.csv', 'w') as output_file:
    for output in Output:
        output_file.write("{}\n".format(output))
This will output:
line1
line2
https://line3
https://line4
line5
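For what it's worth, the code in the question goes wrong for two reasons. Each line read from the file still ends with "\n", so L.endswith("/") is never true. And L.find("#") returns -1 when there is no "#", which is truthy, so that branch runs for almost every line and the later elif is never reached. Below is a rough sketch of a slightly more defensive variant, not the only way to do it. The helper name clean_line and the in-memory sample list are my own, made up for illustration. It assumes that only trailing slashes should go, and that "#" is stripped before the slash check, so it differs a little from the code above.

```python
def clean_line(line: str) -> str:
    """Apply the three clean-up rules to one line (hypothetical helper)."""
    line = line.strip()                 # drop the trailing newline first
    line = line.partition("#")[0]       # keep only the text before the first '#'
    line = line.rstrip("/")             # remove trailing slashes only
    if line.startswith("http:"):
        # Replace just the leading scheme, not later occurrences of 'http:'.
        line = "https:" + line[len("http:"):]
    return line


# Sample data copied from the question, held in memory instead of in.csv.
sample = ["line1/\n", "line2#sdgsgs\n", "https://line3\n", "http://line4\n", "line5/\n"]
cleaned = [clean_line(s) for s in sample]
print("\n".join(cleaned))
```

Using rstrip("/") instead of replace("/", "") means a line such as http://a/b/ keeps its inner slashes and comes out as https://a/b.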