My data looks like this:
GIdx,Date,num,Time
1,11/28/2012,20,10:05:50
1,11/28/2012,20,10:05:50
2,11/28/2012,20,10:09:24
2,11/28/2012,20,10:09:24
2,11/28/2012,20,10:09:25
2,11/28/2012,20,10:09:25
2,11/28/2012,20,10:09:26
3,11/28/2012,20,10:09:34
3,11/28/2012,20,10:09:34
I tried to read the Date column as datetime and the Time column as time, but when I check their types, I get Series:
type(df['Date'])
<class 'pandas.core.series.Series'>
type(df_original['Time'])
<class 'pandas.core.series.Series'>
I did something like this:
df=pd.read_csv(filename,sep=",", header = 0, na_values=['NA'])
Answer 0 (score: 0)
You can add the parse_dates parameter to read_csv, listing the columns that hold the dates and times:
import pandas as pd
import io
temp=u"""GIdx,Date,num,Time
1,11/28/2012,20,10:05:50
1,11/28/2012,20,10:05:50
2,11/28/2012,20,10:09:24
2,11/28/2012,20,10:09:24
2,11/28/2012,20,10:09:25
2,11/28/2012,20,10:09:25
2,11/28/2012,20,10:09:26
3,11/28/2012,20,10:09:34
3,11/28/2012,20,10:09:34"""
# after testing, replace io.StringIO(temp) with filename
df = pd.read_csv(io.StringIO(temp), parse_dates=[['Date','Time']])
print (df)
Date_Time GIdx num
0 2012-11-28 10:05:50 1 20
1 2012-11-28 10:05:50 1 20
2 2012-11-28 10:09:24 2 20
3 2012-11-28 10:09:24 2 20
4 2012-11-28 10:09:25 2 20
5 2012-11-28 10:09:25 2 20
6 2012-11-28 10:09:26 2 20
7 2012-11-28 10:09:34 3 20
8 2012-11-28 10:09:34 3 20
print (df.dtypes)
Date_Time datetime64[ns]
GIdx int64
num int64
dtype: object
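As a side note, type(df['Date']) returns Series for any column, so it is the dtype that shows whether parsing worked. Also, as far as I know, newer pandas releases (2.2 and later) deprecate the nested-list form of parse_dates, so a minimal sketch of an alternative is to read the columns as text and combine them with pd.to_datetime. The format string below is an assumption based on the sample rows, and the sample is shortened to three rows:

```python
import io
import pandas as pd

temp = u"""GIdx,Date,num,Time
1,11/28/2012,20,10:05:50
2,11/28/2012,20,10:09:24
3,11/28/2012,20,10:09:34"""

# Read Date and Time as plain text first.
df = pd.read_csv(io.StringIO(temp))

# Join the two text columns and parse them with an explicit format
# (assumed from the sample data: month/day/year and 24-hour time).
df['Date_Time'] = pd.to_datetime(df['Date'] + ' ' + df['Time'],
                                 format='%m/%d/%Y %H:%M:%S')

# type() of a column is always Series; the dtype describes the stored values.
print(type(df['Date_Time']))
print(df['Date_Time'].dtype)

# The date and time parts are available through the .dt accessor.
print(df['Date_Time'].dt.date.head())
print(df['Date_Time'].dt.time.head())
```

If you need separate date and time columns, the last two expressions can be assigned back to the DataFrame, although they are then stored as generic Python objects rather than as a datetime dtype.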
You can omit the parameters sep=",", header=0 and na_values=['NA'], because these are already the defaults, so the following two calls are equivalent:
df=pd.read_csv(filename,sep=",", header = 0, na_values=['NA'])
df=pd.read_csv(filename)
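A quick way to convince yourself of this is to read the same text both ways and compare the results. This is only a small sketch: the two-row sample, including the NA value in the num column, is made up for illustration.

```python
import io
import pandas as pd

# Hypothetical sample with one NA value to exercise na_values.
temp = u"""GIdx,Date,num,Time
1,11/28/2012,20,10:05:50
2,11/28/2012,NA,10:09:24"""

# Same input, once with the explicit parameters and once with defaults.
explicit = pd.read_csv(io.StringIO(temp), sep=",", header=0, na_values=['NA'])
defaults = pd.read_csv(io.StringIO(temp))

# equals() compares values and dtypes, treating NaNs in the same place as equal.
print(explicit.equals(defaults))        # True
print(defaults['num'].isna().tolist())  # [False, True]
```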