Dropping rows that contain a string in an object column in Python

Time: 2018-03-18 16:49:50

Tags: python pandas

I have the following data and am trying to drop every row whose dateTime column contains "!ENDMSG!":

                   dateTime  tradePrice  tradeVolume  aggVolume     bid1     ask1  quote_counter
33501  2017-09-19 15:59:53     12545.5          1.0    54344.0  12545.0  12545.5      1567101.0
33502  2017-09-19 15:59:59     12545.0          1.0    54345.0  12545.0  12545.5      1567136.0
33503  2017-09-19 16:00:00     12545.0          1.0    54346.0  12544.5  12545.0      1567146.0
33504  2017-09-19 16:03:44     12544.0         17.0    54363.0  12544.0  12544.0      1567519.0
33505             !ENDMSG!         NaN          NaN        NaN      NaN      NaN            NaN

The problem is that the dateTime column has dtype object, so ALL of the following attempts fail:

df[df.dateTime.str.contains('!ENDMSDG!') == False]
df[~df.dateTime.str.contains("!")]
df = df[~df['dateTime'].str.contains('!ENDMSDG!')]
print(df[df['dateTime'].str.match('!ENDMSDG!')])#another try to catch it

Thanks for your suggestions!

1 answer:

Answer 0 (score: 1)

It seems you need to change !ENDMSDG! to !ENDMSG! (without the second D):

df = df[~df['dateTime'].str.contains('!ENDMSG!')]

df = df[~df['dateTime'].str.contains('!')]
# alternative with regex matching disabled (plain substring search)
#df = df[~df['dateTime'].str.contains('!', regex=False)]
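
To see the effect of the typo in isolation, here is a minimal sketch on a hypothetical two-row frame that only mirrors the shape of the question's data (the frame and the variable names typo_mask, fixed_mask and filtered are made up for illustration). It suggests that the object dtype is not what blocks the filter: the misspelled pattern simply matches no row.

```python
import pandas as pd

# Hypothetical two-row frame shaped like the question's data (not the real file).
df = pd.DataFrame({
    "dateTime": ["2017-09-19 16:03:44", "!ENDMSG!"],
    "tradePrice": [12544.0, float("nan")],
})

# The misspelled pattern matches nothing, so ~typo_mask would keep every row.
typo_mask = df["dateTime"].str.contains("!ENDMSDG!", regex=False)

# The corrected literal matches only the sentinel row.
fixed_mask = df["dateTime"].str.contains("!ENDMSG!", regex=False)

# Invert the mask to keep everything except the sentinel row.
filtered = df[~fixed_mask]
print(typo_mask.sum(), fixed_mask.sum(), len(filtered))
```

If the column could also hold missing values, passing na=False to str.contains keeps the mask strictly boolean; whether that applies to the original data is an assumption.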

If you need to check the start of the string:

df = df[~df['dateTime'].str.startswith('!')]

print (df)
                  dateTime  tradePrice  tradeVolume  aggVolume     bid1  \
33501  2017-09-19 15:59:53     12545.5          1.0    54344.0  12545.0   
33502  2017-09-19 15:59:59     12545.0          1.0    54345.0  12545.0   
33503  2017-09-19 16:00:00     12545.0          1.0    54346.0  12544.5   
33504  2017-09-19 16:03:44     12544.0         17.0    54363.0  12544.0   

          ask1  quote_counter  
33501  12545.5      1567101.0  
33502  12545.5      1567136.0  
33503  12545.0      1567146.0  
33504  12544.0      1567519.0
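
A different route, not part of the answer above, is to let pandas parse the column and drop whatever fails to parse. The sketch below assumes every valid row holds a parseable timestamp; the sample frame and the names parsed and cleaned are hypothetical. Note that this removes any unparseable value, not only "!ENDMSG!", and it leaves dateTime as a real datetime column instead of object.

```python
import pandas as pd

# Hypothetical sample shaped like the question's data (not the real file).
df = pd.DataFrame({
    "dateTime": ["2017-09-19 16:00:00", "2017-09-19 16:03:44", "!ENDMSG!"],
    "tradePrice": [12545.0, 12544.0, float("nan")],
})

# errors="coerce" turns unparseable strings such as "!ENDMSG!" into NaT
# instead of raising an exception.
parsed = pd.to_datetime(df["dateTime"], errors="coerce")

# Keep only the rows that parsed, and store the parsed datetime values.
valid = parsed.notna()
cleaned = df[valid].assign(dateTime=parsed[valid])
print(cleaned.dtypes)
```

The string filters shown earlier are the better fit when only the one known marker should be removed and other malformed values need to stay visible.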