I have a file with the following contents:
first_name,last_name,uid,email,dep_code,dep_name
john,smith,jsmith,jsmith@gmail.com,finance,21230
john,king,jking,jjing@gmail.com,human resource,31230
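I want to copy the "email" column into a new column "email2", and then replace gmail.com with hotmail.com in the email2 column.
I'm new to Python, so I need help from the experts. I have tried a few scripts, but if there is a better way, please let me know. The original file contains 60000 rows.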
with open('c:\\Python27\\scripts\\colnewfile.csv', 'rb') as fp_in1, open('c:\\Python27\\scripts\\final.csv', 'wb') as fp_out1:
    writer1 = csv.writer(fp_out1, delimiter=",")
    reader1 = csv.reader(fp_in1, delimiter=",")
    domain = "@hotmail.com"
    for row in reader1:
        if row[2:3] == "uid":
            writer1.append("Email2")
        else:
            writer1.writerow(row+[row[2:3]])
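This is the final script. The only problem is that it does not write the complete output file: the output has only 61409 rows, while the input file has 61438 rows.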
inFile = 'c:\Python27\scripts\in-093013.csv'
outFile = 'c:\Python27\scripts\final.csv'
with open(inFile, 'rb') as fp_in1, open(outFile, 'wb') as fp_out1:
    writer = csv.writer(fp_out1, delimiter=",")
    reader = csv.reader(fp_in1, delimiter=",")
    for col in reader:
        del col[6:]
        writer.writerow(col)
    headers = next(reader)
    writer.writerow(headers + ['email2'])
    for row in reader:
        if len(row) > 3:
            email = email.split('@', 1)[0] + '@hotmail.com'
            writer.writerow(row + [email])
Answer 0: (score: 1)
If you call next() on the reader, you get one row at a time; use it to copy the headers. Copying the email column is then straightforward:
import csv

infilename = r'c:\Python27\scripts\colnewfile.csv'
outfilename = r'c:\Python27\scripts\final.csv'

with open(infilename, 'rb') as fp_in, open(outfilename, 'wb') as fp_out:
    reader = csv.reader(fp_in, delimiter=",")
    headers = next(reader)  # read first row

    writer = csv.writer(fp_out, delimiter=",")
    writer.writerow(headers + ['email2'])

    for row in reader:
        if len(row) > 3:
            # make sure there are at least 4 columns
            email = row[3].split('@', 1)[0] + '@hotmail.com'
            writer.writerow(row + [email])
This code splits the email address on the first @ symbol, takes the first part of the split, and appends @hotmail.com to it:
>>> 'example@gmail.com'.split('@', 1)[0]
'example'
>>> 'example@gmail.com'.split('@', 1)[0] + '@hotmail.com'
'example@hotmail.com'
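The same split can be wrapped in a small helper. This is only a minimal sketch and not part of the original answer: the name `swap_domain`, its `new_domain` parameter, and the choice to return values without an `@` unchanged are assumptions made for illustration.

```python
def swap_domain(address, new_domain="hotmail.com"):
    """Return address with everything after the first '@' replaced by new_domain.

    Hypothetical helper: values that contain no '@' are returned unchanged.
    """
    local_part, sep, _old_domain = address.partition("@")
    if not sep:
        # No '@' found, so there is no domain to replace.
        return address
    return local_part + "@" + new_domain


print(swap_domain("example@gmail.com"))  # example@hotmail.com
```

Note that the plain `split('@', 1)[0] + '@hotmail.com'` expression appends the new domain even to a value that has no `@` at all; the guard in this sketch is one way to avoid that, if such values can occur in the file.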
The above produces:
first_name,last_name,uid,email,dep_code,dep_name,email2
john,smith,jsmith,jsmith@gmail.com,finance,21230,jsmith@hotmail.com
john,king,jking,jjing@gmail.com,human resource,31230,jjing@hotmail.com
for your sample input.
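The script above uses the Python 2 idiom of opening CSV files in 'rb'/'wb' mode, and its `if len(row) > 3` guard writes nothing for rows with fewer than four fields, which may be one reason an output file ends up with fewer rows than the input. The following is a hedged Python 3 sketch of the same approach that keeps short rows and counts them. The function name `add_email2`, its parameters, and the decision to write an empty email2 for short rows are assumptions for illustration, not the answer's own code.

```python
import csv
import io


def add_email2(src, dst, email_index=3, new_domain="hotmail.com"):
    """Copy CSV rows from src to dst, appending an email2 column.

    Returns the number of rows that were too short to contain an email.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(next(reader) + ["email2"])  # header row
    short_rows = 0
    for row in reader:
        if len(row) > email_index:
            local_part = row[email_index].split("@", 1)[0]
            writer.writerow(row + [local_part + "@" + new_domain])
        else:
            # Assumption: keep the row with an empty email2 instead of dropping it.
            short_rows += 1
            writer.writerow(row + [""])
    return short_rows


# In-memory sample so the sketch runs without touching the file system.
sample = io.StringIO(
    "first_name,last_name,uid,email,dep_code,dep_name\n"
    "john,smith,jsmith,jsmith@gmail.com,finance,21230\n"
    "john,king\n"
)
result = io.StringIO()
short_count = add_email2(sample, result)
print(short_count)
print(result.getvalue())
```

With real files in Python 3, both would be opened in text mode with `newline=''`, as the csv module documentation recommends. A non-zero return value indicates how many input rows the original guard would have left out.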
Answer 1: (score: 0)
This can be done very cleanly using pandas. Here it is:
In [1]: import pandas as pd
In [3]: df = pd.read_csv('your_csv_file.csv')
In [4]: def rename_email(row):
   ...:     return row.email.replace('gmail.com', 'hotmail.com')
   ...:
In [5]: df['email2'] = df.apply(rename_email, axis=1)
In [6]: """axis = 1 or ‘columns’: apply function to each row"""
In [7]: df
Out[7]:
   first_name  last_name     uid             email        dep_code  dep_name              email2
0        john      smith  jsmith  jsmith@gmail.com         finance     21230  jsmith@hotmail.com
1        john       king   jking   jjing@gmail.com  human resource     31230   jjing@hotmail.com
In [8]: df.to_csv('new_update_email_file.csv')
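As a possible alternative to the row-wise `apply`, the replacement can be done on the whole column with the pandas string accessor. This is a sketch under assumptions: the sample data is read from an in-memory string rather than a file, and `index=False` is added so the row index is not written as an extra column, which the session above does not do.

```python
import io

import pandas as pd

# In-memory copy of the sample data from the question.
csv_text = (
    "first_name,last_name,uid,email,dep_code,dep_name\n"
    "john,smith,jsmith,jsmith@gmail.com,finance,21230\n"
    "john,king,jking,jjing@gmail.com,human resource,31230\n"
)
df = pd.read_csv(io.StringIO(csv_text))

# Vectorized string replacement on the whole column; regex=False treats
# the pattern as a literal string rather than a regular expression.
df["email2"] = df["email"].str.replace("gmail.com", "hotmail.com", regex=False)

# With no path argument, to_csv returns the CSV text;
# index=False leaves the row index out of it.
output = df.to_csv(index=False)
print(output)
```

For a file of roughly 60000 rows either form should be fine; the column-wise version mainly avoids calling a Python function once per row.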