I can't load a CSV file with the following pandas command.
f1 = pd.read_csv(r'C:\Users\sana.mohan.reddy\Desktop\Python_Practice\Test1.CSV', skiprows=[0,1,2], skip_footer=[0], sep = ',')
I need to skip the first 3 rows and the last row.
Here is the sample data.
Contacts - Total Opens by Campaign
Email Open Date/Time,"Total Opens"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:51 AM,"1"
3/25/2016 6:00:52 AM,"1"
Total,"796"
Can you point out where I am going wrong?
Answer 0 (score: 1)
I think you can use read_csv with a few other parameters (sep=',' is omitted because ',' is already the default value of sep):
import pandas as pd
import io
temp=u'''Email Open Date/Time,"Total Opens"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:51 AM,"1"
3/25/2016 6:00:52 AM,"1"
Total,"796"'''
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp),
                 skipfooter=1,        # skip the last row
                 engine='python',     # skipfooter needs the python engine; avoids a warning
                 skiprows=[0,1,2],    # remove the first 3 rows
                 header=None)         # no header row; columns are named 0, 1, ...
print (df)
                      0  1
0  3/25/2016 6:00:46 AM  1
1  3/25/2016 6:00:46 AM  1
2  3/25/2016 6:00:51 AM  1
3  3/25/2016 6:00:52 AM  1
Edit, based on the real data:
The main problem was the encoding; I had to set it to utf-16.
import pandas as pd
df = pd.read_csv('Test 1.csv',
                 skipfooter=1,        # skip the last row
                 engine='python',     # skipfooter needs the python engine; avoids a warning
                 skiprows=[0,1],      # remove the first 2 rows
                 encoding='utf-16',   # set the encoding
                 parse_dates=[0])     # convert the first column to datetime
print (df)
   Email Open Date/Time  Total Opens
0   2016-03-25 06:00:35            1
1   2016-03-25 06:00:35            1
2   2016-03-25 06:00:46            1
3   2016-03-25 06:00:46            1
4   2016-03-25 06:00:51            1
5   2016-03-25 06:00:52            1
6   2016-03-25 06:00:57            1
7   2016-03-25 06:00:58            1
8   2016-03-25 06:01:03            1
9   2016-03-25 06:01:20            1
10  2016-03-25 06:01:20            1
11  2016-03-25 06:01:25            1
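If you want to reproduce the encoding issue without the original file, here is a minimal sketch. It writes a small UTF-16 file to a temporary directory and reads it back with the same options as above. The file name report.csv and the two preamble lines are made up for illustration; the real export may look different.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file content: two preamble lines, a header row,
# three data rows and a trailing "Total" row.
content = (
    'Contacts - Total Opens by Campaign\n'
    'Exported report\n'
    'Email Open Date/Time,"Total Opens"\n'
    '3/25/2016 6:00:35 AM,"1"\n'
    '3/25/2016 6:00:46 AM,"1"\n'
    '3/25/2016 6:00:51 AM,"1"\n'
    'Total,"3"\n'
)

# Write the file as UTF-16 (Python adds the byte order mark).
path = os.path.join(tempfile.mkdtemp(), 'report.csv')
with open(path, 'w', encoding='utf-16') as f:
    f.write(content)

df = pd.read_csv(path,
                 skiprows=[0, 1],     # drop the two preamble lines, keep the header
                 skipfooter=1,        # drop the "Total" row
                 engine='python',     # skipfooter needs the python engine
                 encoding='utf-16',   # must match how the file was written
                 parse_dates=[0])     # first column to datetime
print(df)
```

Reading the same file without encoding='utf-16' would make pandas assume UTF-8, which is most likely why the original call failed.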
Answer 1 (score: 0)
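You need to correct your read_csv call to:
f1 = pd.read_csv('yourFile.csv', skiprows=3, skipfooter=1, engine='python', sep=',')
because the footer option expects an integer (the number of rows to skip at the bottom of the file), not a list. In current pandas the parameter is spelled skipfooter (skip_footer was the older spelling), and it is only supported by the Python parsing engine, hence engine='python'. See http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html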
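As a quick check of the integer form, the sketch below reads the sample data from the question out of an in-memory string and compares skiprows=3 with skiprows=[0,1,2]. The variable names are only for illustration, and header=None is used because skipping three lines also removes the header row in this sample.

```python
import io

import pandas as pd

# Sample data from the question: a title line, a header row,
# six data rows and a trailing "Total" row.
sample = '''Contacts - Total Opens by Campaign
Email Open Date/Time,"Total Opens"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:35 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:46 AM,"1"
3/25/2016 6:00:51 AM,"1"
3/25/2016 6:00:52 AM,"1"
Total,"796"'''

# skiprows as an integer: number of lines to skip at the top.
by_count = pd.read_csv(io.StringIO(sample), skiprows=3, skipfooter=1,
                       engine='python', header=None)

# skiprows as a list: explicit 0-based line numbers to skip.
by_index = pd.read_csv(io.StringIO(sample), skiprows=[0, 1, 2], skipfooter=1,
                       engine='python', header=None)

print(by_count)
print(by_count.equals(by_index))
```

Both calls should give the same frame. skipfooter, unlike skiprows, only takes a count.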