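How do I fill only the NaN values in a column that also contains real dates and times? Can Python generate random dates and times from a date range?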

Time: 2017-10-23 22:58:55

Tags: python-3.x pandas numpy dataframe pycharm

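My DataFrame has one particular column that contains dates and times. The same column also contains NaN (blank rows), and I want to fill only those blank rows with dates and times like the ones already in the column, not the actual date.

Is there a way to have Python pick at random from a date range? I would like the range to run from 02/01/2011 to 02/01/2013 (or as close to that as possible), with random times.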

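Since the question above was answered (code below), a new problem has come up; I have pasted the error underneath the code from the original question: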

dates = pd.date_range('02/01/2011', periods=len(file1), freq='D').values
np.random.shuffle(dates)

file1.loan_verifieddate = file1.loan_verifieddate.fillna(pd.Series(dates, index=file1.index))

I have also tried the following:

dates = pd.date_range('02/01/2011', '02/01/2013', freq='D')

file1['loan_verifieddate'] = pd.to_datetime(file1['loan_verifieddate'])

file1['loan_verifieddate'].fillna(pd.Series(np.random.choice(dates, size=len(file1))), inplace = True)

Here is the error I get when using the code above:

C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\indexes\datetimes.py:1973: RuntimeWarning: overflow encountered in longlong_scalars
  e = b + np.int64(periods) * stride
Traceback (most recent call last):
  File "C:/Users/ishore/Documents/Custom Office Templates/Pycharm Projects/Task1.py", line 18, in <module>
    file1.loan_verifieddate = file1.loan_verifieddate.fillna(pd.Series(dates, index=file1.index))
  File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py", line 250, in __init__
    data = SingleBlockManager(data, index, fastpath=True)
  File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py", line 4117, in __init__
    fastpath=True)
  File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py", line 2719, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py", line 2224, in __init__
    placement=placement, **kwargs)
  File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py", line 115, in __init__
    len(self.mgr_locs)))
ValueError: Wrong number of items passed 0, placement implies 210229

Below is the column I referred to above:

               loan_verifieddate  
0                            NaN      
1               02/01/2011 10:55        
2               02/01/2011 10:55     
3                            NaN       
4                            NaN         
5               02/01/2011 08:38           
6               02/01/2011 08:38 
7               02/01/2011 08:38 
8                            NaN              
9                            NaN             
10              02/01/2011 08:38              
11              02/01/2011 08:38              
12                           NaN              
13                           NaN               
14                           NaN              
15                           NaN              
16                           NaN               
17                           NaN              
18                           NaN            
19                           NaN              
20                           NaN            
21                           NaN             
22                           NaN           
23                           NaN           
24                           NaN             
25                           NaN             
26              03/01/2011 12:21             
27              03/01/2011 12:21              
28              03/01/2011 12:22             
29                           NaN             
...                          ...              
210199          02/01/2013 12:57          
210200          02/01/2013 11:09            
210201          02/01/2013 12:51            
210202          02/01/2013 13:15           
210203          02/01/2013 12:57             
210204          02/01/2013 12:57           
210205          02/01/2013 12:56           
210206          02/01/2013 12:56           
210207          02/01/2013 16:35         
210208          02/01/2013 18:19              
210209          02/01/2013 19:21              
210210          02/01/2013 12:56             
210211          02/01/2013 14:25            
210212          02/01/2013 16:08           
210213          02/01/2013 12:55          
210214          02/01/2013 12:56           
210215          02/01/2013 18:19            
210216          02/01/2013 13:01           
210217          02/01/2013 16:18           
210218          02/01/2013 17:17          
210219          02/01/2013 13:02           
210220          02/01/2013 13:00           
210221          02/01/2013 17:29        
210222          02/01/2013 16:21         
210223          02/01/2013 15:27        
210224          02/01/2013 15:15  
210225          02/01/2013 13:10
210226          02/01/2013 14:49
210227          02/01/2013 14:51 
210228          02/01/2013 14:58

Any help resolving this error message would be greatly appreciated.

Thank you very much.

1 answer:

Answer 0 (score: 0)

Try this:

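import numpy as np
import pandas as pd

# One candidate date per day in the requested range
dates = pd.date_range('02/01/2011', '02/01/2013', freq='D')
# Parse the existing strings to datetimes (NaN becomes NaT)
df['loan_verifieddate'] = pd.to_datetime(df['loan_verifieddate'])
# Draw one random date per row, aligned on df's index, and fill only the missing rows
fill = pd.Series(np.random.choice(dates, size=len(df)), index=df.index)
df['loan_verifieddate'] = df['loan_verifieddate'].fillna(fill)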
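Two notes on the above. First, the overflow warning in the question looks consistent with the first attempt asking for len(file1) = 210229 daily periods starting in 2011: that is roughly 575 years, which runs past pd.Timestamp.max (year 2262 at nanosecond resolution), so the range probably came back unusable and the Series ended up with 0 items. Passing an explicit end date avoids that. Second, a daily date_range only yields midnight values, while the question also asks for random times. Below is a minimal sketch of one way to draw random timestamps at minute resolution and write them only into the missing rows. The helper name fill_missing_with_random_timestamps, the sample data, and the day-first format string are assumptions for illustration; the question does not say whether the strings are dd/mm or mm/dd.

```python
import numpy as np
import pandas as pd


def fill_missing_with_random_timestamps(series, start, end, seed=None):
    """Return a copy of a datetime Series in which every missing entry is
    replaced by a random timestamp (minute resolution) from [start, end).

    Hypothetical helper, not part of pandas.
    """
    rng = np.random.default_rng(seed)
    start, end = pd.Timestamp(start), pd.Timestamp(end)

    # Number of whole minutes in the range; offsets are drawn from [0, total_minutes)
    total_minutes = (end - start) // pd.Timedelta(minutes=1)

    missing = series.isna()
    offsets = rng.integers(0, total_minutes, size=int(missing.sum()))
    random_values = start + pd.to_timedelta(offsets, unit="min")

    filled = series.copy()
    # Only rows that were missing are overwritten; existing values stay as they are
    filled.loc[missing] = random_values.to_numpy()
    return filled


# Small made-up sample shaped like the column in the question
sample = pd.DataFrame(
    {"loan_verifieddate": [np.nan, "02/01/2011 10:55", np.nan,
                           "03/01/2011 12:21", np.nan]}
)

# Assumption: strings are day-first (dd/mm/yyyy HH:MM); change the format if not
sample["loan_verifieddate"] = pd.to_datetime(
    sample["loan_verifieddate"], format="%d/%m/%Y %H:%M"
)

sample["loan_verifieddate"] = fill_missing_with_random_timestamps(
    sample["loan_verifieddate"], "2011-01-02", "2013-01-02", seed=0
)
print(sample)
```

Because the offsets are generated only for the missing rows, nothing has to be aligned by index, and the number of generated values never depends on the total row count in a way that could push the range out of bounds.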