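I want to delete every row in a data frame that contains a specific string.
Specifically, I want to drop the rows whose email address is invalid (it ends with .jpg).
Here is my code. What is wrong with it?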
import pandas as pd
df = pd.DataFrame({'email':['abc@gmail.com', 'cde@gmail.com', 'ghe@ss.jpg', 'sldkslk@sss.com']})
df
email
0 abc@gmail.com
1 cde@gmail.com
2 ghe@ss.jpg
3 sldkslk@sss.com
for i, r in df.iterrows():
    if df.loc[i,'email'][-3:] == 'com':
        df.drop(df.index[i], inplace=True)
Traceback (most recent call last):
File "<ipython-input-84-4f12d22e5e4c>", line 2, in <module>
if df.loc[i,'email'][-3:] == 'com':
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1472, in __getitem__
return self._getitem_tuple(key)
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 870, in _getitem_tuple
return self._getitem_lowerdim(tup)
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 998, in _getitem_lowerdim
section = self._getitem_axis(key, axis=i)
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1911, in _getitem_axis
self._validate_key(key, axis)
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1798, in _validate_key
error()
File "C:\Anaconda\lib\site-packages\pandas\core\indexing.py", line 1785, in error
axis=self.obj._get_axis_name(axis)))
KeyError: 'the label [2] is not in the [index]'
Answer 0: (score: 1)
IIUC, you can do this without iterating over the frame with iterrows:
df = df[df.email.str.endswith('.com')]
This returns:
>>> df
email
0 abc@gmail.com
1 cde@gmail.com
3 sldkslk@sss.com
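As for why the loop in the question fails, here is a small sketch, assuming the default integer index from the question. `df.index[i]` looks up a position, while `df.loc[i, ...]` looks up a label, and the two stop matching as soon as a row has been dropped. The names `shrunk`, `bad`, and `cleaned` are only illustrative, and `na=False` is an extra safeguard for missing values that the question does not require.

```python
import pandas as pd

df = pd.DataFrame({'email': ['abc@gmail.com', 'cde@gmail.com',
                             'ghe@ss.jpg', 'sldkslk@sss.com']})

# After dropping label 0, positions and labels no longer line up.
shrunk = df.drop(0)
remaining_labels = list(shrunk.index)   # labels 1, 2, 3 remain
label_at_position_1 = shrunk.index[1]   # position 1 now holds label 2

# Safer: build a boolean mask once instead of dropping while iterating.
# na=False treats missing emails as "does not end with .jpg".
bad = df['email'].str.endswith('.jpg', na=False)
cleaned = df[~bad]
print(cleaned)
```

Note also that the condition in the question tests for 'com', so it would drop the valid addresses rather than the .jpg one.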
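Alternatively, for larger data frames it is sometimes faster to skip the str methods that pandas provides and use Python's built-in string methods in a plain list comprehension: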
df = df[[i.endswith('.com') for i in df.email]]
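Whether the list comprehension is actually faster depends on your data and pandas version, so it may be worth measuring. Below is a rough sketch using `timeit`; the frame size (100,000 rows) and the helper names `with_str_accessor` and `with_list_comprehension` are arbitrary choices for illustration.

```python
import timeit
import pandas as pd

emails = ['abc@gmail.com', 'cde@gmail.com', 'ghe@ss.jpg', 'sldkslk@sss.com']
big = pd.DataFrame({'email': emails * 25_000})  # 100,000 rows, arbitrary size

def with_str_accessor(frame):
    # Vectorized pandas string method.
    return frame[frame['email'].str.endswith('.com')]

def with_list_comprehension(frame):
    # Plain Python str.endswith applied element by element.
    return frame[[e.endswith('.com') for e in frame['email']]]

# Both approaches should select exactly the same rows.
same_result = with_str_accessor(big).equals(with_list_comprehension(big))

t_str = timeit.timeit(lambda: with_str_accessor(big), number=5)
t_list = timeit.timeit(lambda: with_list_comprehension(big), number=5)
print(f"str accessor: {t_str:.3f}s, list comprehension: {t_list:.3f}s")
```

One caveat: the list comprehension raises an AttributeError if the column contains missing values (NaN is a float and has no endswith method), whereas the str accessor can handle them.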