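I have to clean the data in a CSV file. The data I want to clean is shown below. Condition: I must append @myclinic.com.au to the end of every string that is missing it.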
douglas@myclinic.com.au
mildura
broadford@myclinic.com.au
officer@myclinic.com.au
nowa nowa@myclinic.com.au
langsborough@myclinic.com.au
brisbane@myclinic.com.au
robertson@myclinic.com.au
logan village
ipswich@myclinic.com.au
This is the code:
DataFrame = pandas.read_csv(ClinicCSVFile)
DataFrame['Email'] = DataFrame['Email'].apply(lambda x: x if '@' in str(x) else str(x)+'@myclinic.com.au')
DataFrameToCSV = DataFrame.to_csv('Temporary.csv', index = False)
print(DataFrameToCSV)
But the output I get is None, and I cannot handle the later part of the problem because it raises the following error:
TypeError: 'NoneType' object is not iterable
It is caused by the DataFrame above. Please help me.
Answer 0 (score: 1)
Use endswith as the condition, invert it with ~, and append the string to the end:
df.loc[~df['Email'].str.endswith('@myclinic.com.au'), 'Email'] += '@myclinic.com.au'
# if you only need to check for '@'
#df.loc[~df['Email'].str.contains('@'), 'Email'] += '@myclinic.com.au'
print (df)
Email
0 douglas@myclinic.com.au
1 mildura@myclinic.com.au
2 broadford@myclinic.com.au
3 officer@myclinic.com.au
4 nowa nowa@myclinic.com.au
5 langsborough@myclinic.com.au
6 brisbane@myclinic.com.au
7 robertson@myclinic.com.au
8 logan village@myclinic.com.au
9 ipswich@myclinic.com.au
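The two conditions give the same result on the sample data, but they differ as soon as an address on another domain appears. The sketch below illustrates this with a hypothetical address, someone@example.com, which is not in the original data; `Series.where` is used here only to keep both results side by side.

```python
import pandas as pd

suffix = '@myclinic.com.au'
# 'someone@example.com' is a made-up value added for illustration
emails = pd.Series(['mildura', 'someone@example.com', 'ipswich@myclinic.com.au'])

# Keep values that already end with the suffix, append it to all others
by_suffix = emails.where(emails.str.endswith(suffix), emails + suffix)

# Keep values that contain any '@', append the suffix only to bare names
by_at = emails.where(emails.str.contains('@', regex=False), emails + suffix)

print(by_suffix.tolist())
print(by_at.tolist())
```

With endswith the foreign address gets a second domain appended, while the '@' check leaves it untouched, so the right choice depends on what the column is expected to contain.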
It works fine for me:
df = pd.DataFrame({'Email': ['douglas@myclinic.com.au', 'mildura', 'broadford@myclinic.com.au', 'officer@myclinic.com.au', 'nowa nowa@myclinic.com.au', 'langsborough@myclinic.com.au', 'brisbane@myclinic.com.au', 'robertson@myclinic.com.au', 'logan village', 'ipswich@myclinic.com.au']})
df.loc[~df['Email'].str.contains('@'), 'Email'] += '@myclinic.com.au'
print (df)
Email
0 douglas@myclinic.com.au
1 mildura@myclinic.com.au
2 broadford@myclinic.com.au
3 officer@myclinic.com.au
4 nowa nowa@myclinic.com.au
5 langsborough@myclinic.com.au
6 brisbane@myclinic.com.au
7 robertson@myclinic.com.au
8 logan village@myclinic.com.au
9 ipswich@myclinic.com.au
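As for the None in the question: DataFrame.to_csv returns None whenever it is given a path or buffer to write to, and returns the CSV text only when called without one. Printing or iterating over that return value is the likely source of the TypeError. A minimal sketch, using an in-memory buffer in place of 'Temporary.csv' so that it runs without touching the disk:

```python
import io
import pandas as pd

df = pd.DataFrame({'Email': ['douglas@myclinic.com.au', 'mildura', 'logan village']})
df.loc[~df['Email'].str.endswith('@myclinic.com.au'), 'Email'] += '@myclinic.com.au'

# Writing to a target (file path or buffer): data is written, return value is None
buffer = io.StringIO()
result = df.to_csv(buffer, index=False)
print(result)

# No target: the CSV content comes back as a string
csv_text = df.to_csv(index=False)
print(csv_text)
```

In other words, keep working with the DataFrame itself after saving it, rather than with whatever to_csv returns.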
Answer 1 (score: 0)
Use apply together with endswith.
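The example code for this answer did not survive, so the following is only a sketch of what an apply plus endswith approach could look like. The helper name add_suffix and the shortened sample data are assumptions, not part of the original answer.

```python
import pandas as pd

SUFFIX = '@myclinic.com.au'

def add_suffix(value):
    # Hypothetical helper: append the domain only when it is missing
    text = str(value)
    return text if text.endswith(SUFFIX) else text + SUFFIX

df = pd.DataFrame({'Email': ['douglas@myclinic.com.au', 'mildura', 'logan village']})
df['Email'] = df['Email'].apply(add_suffix)
print(df)
```

This runs a Python function per row, so on large files the vectorized .str approach from the other answer will usually be faster.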