Force the hours and minutes of a datetime to zero in pandas

Time: 2015-07-30 11:55:11

Tags: python datetime pandas

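I want to strip the hours and minutes from the data below and keep only 'YYYY-MM-DD 00:00:00'.

Is there a shorter way than the one below (I would like a datetime64[ns] as the result), and why does np.array() force a timezone...?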

In[229]: index = pd.date_range('2015-01-01', freq = 'H', periods=10)
In[230]: df = pd.DataFrame(index = range(len(index)), data=index)

In[231]: df
Out[230]: 
                    0
0 2015-01-01 00:00:00
1 2015-01-01 01:00:00
2 2015-01-01 02:00:00
3 2015-01-01 03:00:00
4 2015-01-01 04:00:00
5 2015-01-01 05:00:00
6 2015-01-01 06:00:00
7 2015-01-01 07:00:00
8 2015-01-01 08:00:00
9 2015-01-01 09:00:00

In[236]:np.array(pd.to_datetime(pd.Index(index).date))
Out[236]: 
array(['2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100',
       '2015-01-01T01:00:00.000000000+0100'], dtype='datetime64[ns]')

2 answers:

Answer 0 (score: 1)

Just access the .date attribute:

In [88]:
index = pd.date_range('2015-01-01', freq = 'H', periods=10).date
df = pd.DataFrame(index = range(len(index)), data=index)
df

Out[88]:
            0
0  2015-01-01
1  2015-01-01
2  2015-01-01
3  2015-01-01
4  2015-01-01
5  2015-01-01
6  2015-01-01
7  2015-01-01
8  2015-01-01
9  2015-01-01

Edit

If the dtype is already datetime64, you can access the dt.date attribute to change it:

In [97]:

df[0] = df[0].dt.date
df
Out[97]:
            0
0  2015-01-01
1  2015-01-01
2  2015-01-01
3  2015-01-01
4  2015-01-01
5  2015-01-01
6  2015-01-01
7  2015-01-01
8  2015-01-01
9  2015-01-01
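Since the question asks for a datetime64[ns] result, it may be worth checking which dtype each approach produces. The following is a minimal sketch, not part of the original answer: it assumes a reasonably recent pandas in which the Series.dt.normalize() accessor is available, and the column name 'Date' and the variable names are chosen here for illustration only.

```python
import pandas as pd

# Hourly timestamps as in the question. A Timedelta is passed as the frequency
# so the example does not depend on the spelling of the hourly alias.
index = pd.date_range('2015-01-01', periods=10, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({'Date': index})

# .dt.date returns plain Python datetime.date objects (not a datetime64 column).
as_dates = df['Date'].dt.date

# .dt.normalize() keeps a datetime64 column and sets every time to 00:00:00.
as_midnight = df['Date'].dt.normalize()

print(as_dates.dtype)
print(as_midnight.dtype)
print(as_midnight.head(3))
```

If the column must stay a datetime64 type, for example for later resampling or time arithmetic, normalize() is probably the better fit; .date is convenient when only calendar dates are needed.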

Answer 1 (score: 1)

If the data is already loaded, you can use the pd.datetools.normalize_date function, which is what it is for:

In [1]:
index = pd.date_range('2015-01-01', freq = 'H', periods=10)
df = pd.DataFrame(index = range(len(index)), data=index, columns=['Date'])
df

Out[1]:
                 Date
0 2015-01-01 00:00:00
1 2015-01-01 01:00:00
2 2015-01-01 02:00:00
3 2015-01-01 03:00:00
4 2015-01-01 04:00:00
5 2015-01-01 05:00:00
6 2015-01-01 06:00:00
7 2015-01-01 07:00:00
8 2015-01-01 08:00:00
9 2015-01-01 09:00:00

In [142]:
df['Date'] = df['Date'].apply(pd.datetools.normalize_date)
df

Out[142]:
        Date
0 2015-01-01
1 2015-01-01
2 2015-01-01
3 2015-01-01
4 2015-01-01
5 2015-01-01
6 2015-01-01
7 2015-01-01
8 2015-01-01
9 2015-01-01

Note that you can also call normalize on the index.
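Calling normalize on the index could look like the sketch below. It is an illustration under assumptions rather than the answerer's own code: it uses DatetimeIndex.normalize() and to_numpy() from current pandas (to my knowledge pd.datetools is no longer available in recent releases), and the variable names are made up here. On the timezone part of the question: as far as I know, NumPy versions before 1.11 rendered datetime64 values in the local time zone when printing, so the '+0100' in the question's output would be a display effect and the stored values would still be midnight.

```python
import numpy as np
import pandas as pd

# Hourly timestamps as in the question, with a Timedelta as the frequency.
index = pd.date_range('2015-01-01', periods=10, freq=pd.Timedelta(hours=1))

# Set every timestamp in the index to midnight; the result is a DatetimeIndex.
midnight = index.normalize()

# Convert to a NumPy datetime64 array, which is what the question asked for.
values = midnight.to_numpy()

print(midnight.dtype)
# Compare against a date-only datetime64 instead of relying on the printed form.
print((values == np.datetime64('2015-01-01')).all())
```

Comparing values instead of reading the printed representation avoids being misled by how a particular NumPy version formats datetime64.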