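In my DataFrame there is one column that contains dates and times. The same column also has NaN (blank) rows, and I want to fill only those blank rows with dates and times like the ones already in the column, not with the current date.
Is there a way to have Python pick randomly from a date range? I would like the range to run from 02/01/2011 to 02/01/2013 (or as close to that as possible), with random times.
Since the question above was answered (code below), a new problem has come up, which I have pasted beneath the code from the original question: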
dates = pd.date_range('02/01/2011', periods=len(file1), freq='D').values
np.random.shuffle(dates)
file1.loan_verifieddate = file1.loan_verifieddate.fillna(pd.Series(dates, index=file1.index))
I also tried the following:
dates = pd.date_range('02/01/2011', '02/01/2013', freq='D')
file1['loan_verifieddate'] = pd.to_datetime(file1['loan_verifieddate'])
file1['loan_verifieddate'].fillna(pd.Series(np.random.choice(dates, size=len(file1))), inplace = True)
This is the error I get when running the code above:
C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\indexes\datetimes.py:1973: RuntimeWarning: overflow encountered in longlong_scalars
e = b + np.int64(periods) * stride
Traceback (most recent call last):
File "C:/Users/ishore/Documents/Custom Office Templates/Pycharm Projects/Task1.py", line 18, in <module>
file1.loan_verifieddate = file1.loan_verifieddate.fillna(pd.Series(dates, index=file1.index))
File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\series.py", line 250, in __init__
data = SingleBlockManager(data, index, fastpath=True)
File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py", line 4117, in __init__
fastpath=True)
File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py", line 2719, in make_block
return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py", line 2224, in __init__
placement=placement, **kwargs)
File "C:\Users\ishore\AppData\Roaming\Python\Python36\site-packages\pandas\core\internals.py", line 115, in __init__
len(self.mgr_locs)))
ValueError: Wrong number of items passed 0, placement implies 210229
Below is the column I referred to above:
loan_verifieddate
0 NaN
1 02/01/2011 10:55
2 02/01/2011 10:55
3 NaN
4 NaN
5 02/01/2011 08:38
6 02/01/2011 08:38
7 02/01/2011 08:38
8 NaN
9 NaN
10 02/01/2011 08:38
11 02/01/2011 08:38
12 NaN
13 NaN
14 NaN
15 NaN
16 NaN
17 NaN
18 NaN
19 NaN
20 NaN
21 NaN
22 NaN
23 NaN
24 NaN
25 NaN
26 03/01/2011 12:21
27 03/01/2011 12:21
28 03/01/2011 12:22
29 NaN
... ...
210199 02/01/2013 12:57
210200 02/01/2013 11:09
210201 02/01/2013 12:51
210202 02/01/2013 13:15
210203 02/01/2013 12:57
210204 02/01/2013 12:57
210205 02/01/2013 12:56
210206 02/01/2013 12:56
210207 02/01/2013 16:35
210208 02/01/2013 18:19
210209 02/01/2013 19:21
210210 02/01/2013 12:56
210211 02/01/2013 14:25
210212 02/01/2013 16:08
210213 02/01/2013 12:55
210214 02/01/2013 12:56
210215 02/01/2013 18:19
210216 02/01/2013 13:01
210217 02/01/2013 16:18
210218 02/01/2013 17:17
210219 02/01/2013 13:02
210220 02/01/2013 13:00
210221 02/01/2013 17:29
210222 02/01/2013 16:21
210223 02/01/2013 15:27
210224 02/01/2013 15:15
210225 02/01/2013 13:10
210226 02/01/2013 14:49
210227 02/01/2013 14:51
210228 02/01/2013 14:58
Any help in resolving this error message would be greatly appreciated.
Thank you very much.
Answer 0 (score: 0)
Try this:
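import numpy as np
import pandas as pd
dates = pd.date_range('02/01/2011', '02/01/2013', freq='D')
df['loan_verifieddate'] = pd.to_datetime(df['loan_verifieddate'])
# Index the filler by df.index so fillna aligns row by row; assign back instead of using inplace=True
df['loan_verifieddate'] = df['loan_verifieddate'].fillna(pd.Series(np.random.choice(dates, size=len(df)), index=df.index))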
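Two notes that may help. The overflow warning and the ValueError in the question seem to come from the first attempt, where periods=len(file1) asks for 210229 daily steps starting in 2011. That runs roughly 575 years ahead, past the year-2262 limit of pandas' nanosecond timestamps, so the generated range ends up unusable; bounding the range with an end date avoids this. Also, freq='D' only yields midnight dates, not random times. Below is a minimal sketch that draws random minute offsets inside the range and fills only the missing rows. The small df, the seed, and the day-first format string are my assumptions for illustration, not taken from the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real frame (which has about 210k rows).
df = pd.DataFrame({
    "loan_verifieddate": ["02/01/2011 10:55", None, "03/01/2011 12:21", None, None]
})

# Assumption: the strings are day-first (dd/mm/yyyy). Change the format if they are month-first.
df["loan_verifieddate"] = pd.to_datetime(
    df["loan_verifieddate"], format="%d/%m/%Y %H:%M"
)

# Range bounds, read with the same day-first assumption.
start = pd.Timestamp("2011-01-02")
end = pd.Timestamp("2013-01-02")

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible
mask = df["loan_verifieddate"].isna()

# Draw one random whole-minute offset from `start` per missing row.
total_minutes = int((end - start) / pd.Timedelta(minutes=1))
offsets = rng.integers(0, total_minutes, size=int(mask.sum()))
random_stamps = start + pd.to_timedelta(offsets, unit="min")

# Write the random timestamps into the missing rows only; existing values stay untouched.
df.loc[mask, "loan_verifieddate"] = random_stamps.to_numpy()

print(df)
```

If you would rather reuse values that already occur in the column, as the question first describes, you could sample from df.loc[~mask, 'loan_verifieddate'] instead of generating offsets.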