Question

I have a data frame and I'm trying to remove a phrase from the front of some of its elements:

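Sector

Industrial Engineering
FTSE All-Share Sector Real Estate
FTSE ST All-Share
FTSE All-Share Sector Industrial Transportation
FTSE All-Share Sector Fixed Line Telecommunications
FTSE All-Share Sector Software & Computer Services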

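So where an element starts with 'FTSE All-Share Sector', I want to remove that phrase:

Sector

Industrial Engineering
Real Estate
FTSE ST All-Share
Industrial Transportation
Fixed Line Telecommunications
Software & Computer Services

I have tried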

df.Sector = df.Sector.map(lambda x: x.lstrip('FTSE All-Share Sector '))

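which works in some cases but not in others:

Industrial Engineering
Real Estate
{blank}, i.e. everything was removed
Industrial Transportation
ixed Line Telecommunications
ftware & Computer Services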

So I'm guessing it is stripping every character that appears in 'FTSE All-Share Sector' rather than the phrase as a whole.
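That guess matches how Python's str.lstrip is documented to behave: its argument is treated as a set of characters, and leading characters are removed for as long as they belong to that set. A minimal sketch in plain Python, using two values from the column above; strip_prefix is a hypothetical helper added only for illustration:

```python
prefix = "FTSE All-Share Sector "

# lstrip treats its argument as a set of characters, not as a prefix,
# so it keeps removing leading characters while they are in that set.
over_stripped = "FTSE All-Share Sector Fixed Line Telecommunications".lstrip(prefix)
emptied = "FTSE ST All-Share".lstrip(prefix)
print(repr(over_stripped))
print(repr(emptied))


def strip_prefix(text, prefix):
    """Remove prefix only when text really starts with it (hypothetical helper)."""
    return text[len(prefix):] if text.startswith(prefix) else text


fixed = strip_prefix("FTSE All-Share Sector Fixed Line Telecommunications", prefix)
untouched = strip_prefix("FTSE ST All-Share", prefix)
print(repr(fixed))
print(repr(untouched))
```

The "F" of "Fixed" and the "So" of "Software" are in the character set, which is why those entries lose their first letters, and every character of "FTSE ST All-Share" is in the set, which is why that entry is emptied.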

I have also tried

df.Sector.replace (["FTSE All-Share Sector "],[""])

which runs but has no visible effect,

and

if df.Sector.str.startswith('FTSE All-Share Sector '):
    df.Sector = df.Sector[-24:]

which produces the following error:

Traceback (most recent call last):
File "C:\Users\Alan\Downloads\eclipse-standard-kepler-SR1-win32-x86_64\eclipse\plugins\org.python.pydev_3.2.1.201401262345\pysrc\pydev_runfiles.py", line 466, in __get_module_from_str
mod = __import__(modname)
File "C:/Users/Alan/workspace/Data analysis/Tests\import.py", line 57, in <module>
if df.Sector.str.startswith('FTSE All-Share Sector '):
File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 665, in __nonzero__
.format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
ERROR: Module: import could not be imported (file: C:\Users\Alan\workspace\Data analysis\Tests\import.py).

Thanks in advance, I hope there is a simple fix!

Answer 1

You can use str.replace:

df.Sector = df.Sector.str.replace("FTSE All-Share Sector ", "")
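Note that str.replace removes the phrase wherever it occurs, not only at the start of the string. If only a leading occurrence should go, one possible approach is to anchor the pattern with ^ and pass regex=True explicitly (the default for regex has changed between pandas versions, so it is safer to state it). The sketch below rebuilds the question's column as sample data and also shows a boolean-mask variant, which is what the failing if statement in the question was reaching for. The variable names anchored, mask and masked are illustrative only.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({"Sector": [
    "Industrial Engineering",
    "FTSE All-Share Sector Real Estate",
    "FTSE ST All-Share",
    "FTSE All-Share Sector Industrial Transportation",
    "FTSE All-Share Sector Fixed Line Telecommunications",
    "FTSE All-Share Sector Software & Computer Services",
]})

prefix = "FTSE All-Share Sector "

# Option 1: anchored regex, so only an occurrence at the start is removed.
# The prefix contains no regex metacharacters, so it is used literally.
anchored = df["Sector"].str.replace("^" + prefix, "", regex=True)

# Option 2: boolean mask. str.startswith returns a Series of booleans,
# which cannot be used in a plain `if`; use it to pick rows instead.
mask = df["Sector"].str.startswith(prefix)
masked = df["Sector"].where(~mask, df["Sector"].str[len(prefix):])

df["Sector"] = anchored
print(df)
```

Slicing with len(prefix) avoids hand-counting the prefix length, and in both variants "FTSE ST All-Share" is left alone because it does not start with the full phrase.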

Removing a phrase from some elements in a pandas DataFrame
