Excluding the last two rows when importing a csv file with read_csv in Pandas

Time: 2018-01-30 05:28:54

Tags: pandas dataframe python-import

Afternoon all,

I extract data from SQL Server in csv format and then read the file into pandas.

df = pd.read_csv(
                            'TKY_RFQs.csv', 
                            sep='~', 
                            usecols=[
                                     0,1,2,3,4,5,6,7,8,9,
                                     10,11,12,13,14,15,16,17,18,19,
                                     20,21,22,23,24,25,26,27,28,29,
                                     30,31,32,33,34,35,36,37
                                    ]
                )

At the end of the file there is a blank line followed by a record count, both of which I want to remove.

[Screenshot: end of the file]

I worked around it with the code below, but I would like to fix the root problem:

# Count_Row=df.shape[0] # gives number of row count
# df_Sample = df[['trading_book','state', 'rfq_num_of_dealers']].head(Count_Row-1)

Is there a way to exclude the last two lines of the file, or alternatively to drop rows in which every column is null?

Pete

2 Answers:

Answer 0: (score: 0)

You could try this:

df = pd.read_csv(
                            'TKY_RFQs.csv', 
                            sep='~', 
                            usecols=[
                                     0,1,2,3,4,5,6,7,8,9,
                                     10,11,12,13,14,15,16,17,18,19,
                                     20,21,22,23,24,25,26,27,28,29,
                                     30,31,32,33,34,35,36,37
                                    ]
                )[:-2]

Example:

from pandas import read_csv
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
data = read_csv(url, names=names)[:-2] #to exclude last two rows
#data = read_csv(url, names=names) #to include all rows
print(data)
#description = data.describe()
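
One caveat with slicing, sketched below on made-up data: read_csv skips blank lines by default (skip_blank_lines=True), so the blank line in the footer never becomes a row and only the record-count line does. In that case the slice needs to drop one row, not two, which matches the head(Count_Row-1) workaround in the question. The sample string and the column names a, b, c are assumptions for illustration, not the real TKY_RFQs.csv layout.

```python
import io

import pandas as pd

# Hypothetical stand-in for the export: '~'-separated rows,
# then a blank line, then a record count.
raw = "a~b~c\n1~2~3\n4~5~6\n7~8~9\n\n3\n"

df_all = pd.read_csv(io.StringIO(raw), sep="~")

# The blank line is skipped while parsing, so only the record-count
# line shows up as an extra (mostly empty) row at the bottom.
df_trimmed = df_all.iloc[:-1]

print(df_all.tail(2))
print(len(df_all), len(df_trimmed))
```

Checking df.tail() before choosing the slice size is a cheap way to avoid cutting off a real record. Note also that the footer row has already been parsed by this point, so columns it leaves empty are filled with NaN and may come back as float rather than int.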

Answer 1: (score: 0)

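You can use skipfooter directly in read_csv. (skiprows counts lines from the start of the file, so it cannot target the footer.) skipfooter requires the Python parsing engine:

df = pd.read_csv(
                            'TKY_RFQs.csv', 
                            sep='~', 
                            usecols=[
                                     0,1,2,3,4,5,6,7,8,9,
                                     10,11,12,13,14,15,16,17,18,19,
                                     20,21,22,23,24,25,26,27,28,29,
                                     30,31,32,33,34,35,36,37
                                    ],
                            skipfooter=2,    # skip the trailing blank line and the record count; check df.tail() and adjust if needed
                            engine='python'  # skipfooter is not supported by the default C engine
                )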
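
On the second part of the question (dropping rows where every column is null), here is a minimal sketch on made-up data. It assumes the footer line holds only the record count in its first field, so the row is not entirely null and dropna(how="all") alone leaves it in place. Restricting the check to columns that real records always fill in removes it. The sample string and the column names a, b, c are hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for the export: data rows, a blank line, a record count.
raw = "a~b~c\n1~2~3\n4~5~6\n7~8~9\n\n3\n"
df = pd.read_csv(io.StringIO(raw), sep="~")

# The footer row has a value in column "a" (the count), so it is not all-null
# and survives a plain how="all" drop.
still_has_footer = df.dropna(how="all")

# Dropping rows where every one of the other columns is null removes the footer.
cleaned = df.dropna(subset=["b", "c"], how="all")

print(len(still_has_footer), len(cleaned))
```

This only works if genuine records never have all of the chosen columns empty, so the subset should be picked with the real data in mind.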