Pandas to_datetime不适用于空值

时间:2018-04-10 14:47:57

标签: python pandas dataframe string-to-datetime

是的,我会尽可能地清楚。

这是我的数据框base_varlist2

completion_date_latest completion_date_original customer_birth_date_1  \
0              07/10/2004               17/05/1996            04/02/1963   
1              16/02/2004               16/02/2004            31/10/1968   
2              25/03/2004               25/03/2004            18/09/1960   
3              10/02/2004               10/02/2004            18/04/1972   
4              03/08/2010               25/05/2004            12/09/1960   
5              16/04/2004               16/04/2004            27/08/1975   
6              05/04/2004               05/04/2004            02/02/1971   
7              26/03/2004               26/03/2004            05/05/1959   
8              29/07/2004               29/07/2004            10/10/1960   
9              14/06/2004               14/06/2004            29/07/1962   
10             16/09/2004               16/09/2004            03/11/1969   
11             07/07/2004               07/07/2004            01/05/1972   
12             10/05/2004               10/05/2004            04/12/1952   
13             14/07/2004               14/07/2004            20/08/1967   
14             04/08/2006               04/08/2006            22/05/1973   
15             10/08/2004               10/08/2004            05/01/1964   
16             23/09/2004               23/09/2004            30/07/1970   
17             12/03/2010               17/09/2004            15/01/1964   
18             16/10/2006               27/09/2004            06/01/1947   
19             10/08/2006               10/08/2006            26/03/1964   
20             03/08/2004               03/08/2004            31/10/1971   
21             06/10/2004               06/10/2004            18/03/1958   
22             13/12/2005               13/10/2004            23/01/1986   
23             31/08/2004               31/08/2004            22/01/1959   
24             08/03/2007               09/03/2007            20/09/1974   

   customer_birth_date_2    d_start latest_maturity_date  \
0                    NaN  01-Feb-18           01/03/2027   
1                    NaN  01-Feb-18           16/04/2021   
2                    NaN  01-Feb-18           16/03/2024   
3             03/01/1972  01-Feb-18           16/02/2029   
4                    NaN  01-Feb-18           16/07/2026   
5             23/05/1975  01-Feb-18           16/04/2027   
6             22/11/1972  01-Feb-18           16/04/2029   
7             08/10/1959  01-Feb-18           16/03/2016   
8             14/09/1961  01-Feb-18           16/07/2024   
9                    NaN  01-Feb-18           16/07/2020   
10            23/01/1966  01-Feb-18           16/02/2034   
11                   NaN  01-Feb-18           16/07/2029   
12            06/08/1961  01-Feb-18           16/05/2018   
13                   NaN  01-Feb-18           16/07/2029   
14                   NaN  01-Feb-18           16/08/2026   
15            16/09/1966  01-Feb-18           16/08/2029   
16            19/07/1968  01-Feb-18           16/06/2026   
17            18/08/1969  01-Feb-18           16/10/2022   
18            30/07/1957  01-Feb-18           16/09/2021   
19                   NaN  01-Feb-18           16/08/2028   
20            15/10/1964  01-Feb-18           16/08/2029   
21            09/02/1959  01-Feb-18           16/10/2022   
22                   NaN  01-Feb-18           16/01/2037   
23                   NaN  01-Feb-18           16/08/2023   
24                   NaN  01-Feb-18           01/03/2027   

   latest_valuation_date  sdate startdt_def  
0             08/05/2004    NaN         NaN  
1             17/01/2004    NaN         NaN  
2             02/01/2004    NaN         NaN  
3             30/12/2003    NaN         NaN  
4             14/06/2010    NaN         NaN  
5             16/03/2004    NaN         NaN  
6             17/02/2004    NaN         NaN  
7             02/03/2004    NaN   01-Sep-16  
8             19/05/2004    NaN         NaN  
9             10/05/2004    NaN         NaN  
10            01/07/2004    NaN         NaN  
11            05/02/2004    NaN         NaN  
12            07/04/2004    NaN         NaN  
13            22/04/2004    NaN         NaN  
14            26/04/2006    NaN         NaN  
15            05/05/2004    NaN         NaN  
16            21/05/2004    NaN         NaN  
17            18/02/2010    NaN         NaN  
18            25/09/2006    NaN         NaN  
19            26/06/2006    NaN         NaN  
20            07/07/2004    NaN         NaN  
21            07/09/2004    NaN         NaN  
22            29/09/2005    NaN   01-Dec-07  
23            02/04/2004    NaN         NaN  
24            30/01/2007    NaN         NaN  

varlist2是我的数据框中所有列的列表:

In [123]: varlist2
Out[123]: 
['completion_date_latest',
 'completion_date_original',
 'customer_birth_date_1',
 'customer_birth_date_2',
 'd_start',
 'latest_maturity_date',
 'latest_valuation_date',
 'sdate',
 'startdt_def']

我想将所有日期转换为日期时间。当然,其中一些是NaN

这是我尝试过的:

mm_dates_base = base_varlist2.copy()

for l in range (0,len(varlist2)):
    date_var = varlist2[l]
    print('MM_Dates transform variable: ' + date_var)
    mm_dates_base[date_var] = mm_dates_base[date_var].fillna('')
    mm_dates_base[date_var] = pd.to_datetime(mm_dates_base[date_var], errors='coerce', dayfirst=True)

所以我遍历列表中的每个元素并将缺失值更改为空白。然后我将mm_dates中的日期更改为datetime。这可以游泳,直到它失去价值。运行for循环时没有错误,但是当我运行它时:

print(mm_dates_base.iloc[0])

我得到一个ValueError:

ValueError: cannot convert float NaN to integer

这很奇怪。我甚至有error='coerce' ......有谁知道可能会出错?

1 个答案:

答案 0 :(得分:0)

将for循环更改为errors='ignore'

for l in range (0,len(varlist2)):
    date_var = varlist2[l]
    print('MM_Dates transform variable: ' + date_var)

    mm_dates_base[date_var] = pd.to_datetime(mm_dates_base[date_var], errors='ignore', dayfirst=True)
    mm_dates_base[date_var] = mm_dates_base[date_var].fillna('')