Delete rows depending on whether a cell holds a string or an int

Time: 2018-05-25 03:29:14

Tags: python pandas

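I have 500k rows of data, and the formatting is somewhat inconsistent across the file. I am using Spyder and pandas to clean the data.

One column contains either numbers or strings. If a particular cell in that column holds a string, I want to delete the entire row.

My code is shown below; it has been adjusted slightly because the data is confidential.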

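import pandas as pd
import csv

# Note: error_bad_lines was removed in pandas 2.0; newer versions take
# on_bad_lines="skip" instead.
# Inside [...], "|" is a literal pipe, so this separator splits on
# whitespace, "|", "," and "/".
mydataset = pd.read_csv('test.txt', error_bad_lines=False,
                    engine='python',
                    index_col=False, header=None, quoting=csv.QUOTE_NONE,
                    sep=r"[\s|,|/]", names=["1","2","3","4","a","b","c",
                    "h","i","j","k","l","m","n","o","p","f","g",
                    "q","r","s","t","u","v","w","x","y","z",
                    "5","6","7","8","9","10","11","12","13","14"])

print(mydataset.shape)

# Drop the columns that are not needed
columns = ['3','4','h','a','b','c','i','j','k','l','m','n','f','g']
mydataset.drop(columns, inplace=True, axis=1)
print(mydataset.shape)

# A column named "2" must be accessed with brackets; mydataset.2 is a syntax error
mydataset = mydataset[(mydataset.q.notnull()) & (mydataset.r.notnull()) &
    (mydataset.s.notnull()) & (mydataset["2"].notnull()) & (mydataset["2"] != "@")]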
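
The chained notnull() checks in the filter could probably also be written with dropna(subset=...). Below is a minimal sketch under that assumption. Only the column names "q", "r", "s" and "2" come from the code above; the frame name df and all sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample; only the column names come from the question.
df = pd.DataFrame({
    "2": ["123", "@", "456", None],
    "q": [1.0, 2.0, None, 4.0],
    "r": [1.0, 2.0, 3.0, 4.0],
    "s": [1.0, 2.0, 3.0, 4.0],
})

# Drop rows that have a missing value in any of the listed columns.
filtered = df.dropna(subset=["q", "r", "s", "2"])

# Then drop rows whose "2" column is the literal "@".
# Bracket access is needed because df.2 is not valid Python.
filtered = filtered[filtered["2"] != "@"]

print(filtered)
```

Only the first sample row survives here: the second has "@" in column "2", the third is missing "q", and the fourth is missing "2".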

Please excuse the naming convention of the column headers.

Example of data:
1    2    3    4   <--header
abc  123  123  bcd <--Data
123  123  123  bcd <--Data

I want to detect "abc" and delete the entire row.

Please advise!

1 answer:

Answer 0 (score: -1)

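Use dataframe.apply with a row-wise function; it might look like the following (I'm not sure all the syntax is correct):

def has_abc(row):
     # True when any cell in this row equals the string 'abc'
     return (row == 'abc').any()
# Keep only the rows that do not contain 'abc'
mydataset = mydataset[~mydataset.apply(has_abc, axis=1)]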
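
Another option that avoids a row-wise function is to try converting the column to numbers and drop whatever fails to convert. Below is a minimal sketch of that idea using the two sample rows from the question. It assumes the values arrive as strings and that column "1" is the one to check; the names df, as_number and cleaned are made up for illustration.

```python
import pandas as pd

# The two sample rows from the question, read as strings on purpose.
df = pd.DataFrame(
    [["abc", "123", "123", "bcd"],
     ["123", "123", "123", "bcd"]],
    columns=["1", "2", "3", "4"],
)

# Values that cannot be parsed as numbers become NaN.
as_number = pd.to_numeric(df["1"], errors="coerce")

# Keep only the rows whose column "1" parsed as a number.
cleaned = df[as_number.notna()]

print(cleaned)
```

With this approach, cells that were already missing in the checked column are dropped as well, because they are also NaN after the conversion.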