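I want to replace certain timestamp values in a spreadsheet using pandas in Python.
Spreadsheet A has a timestamp value every 33 milliseconds (in the Time Stamp column). It also (in)conveniently skips the "S.000" timestamp at the top of every second.
SPREADSHEET A
Time Stamp | Position 1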
5/4/2017 10:00:00.000 | 0.005430355
5/4/2017 10:00:00.033 | 0.004475154
5/4/2017 10:00:00.066 | 0.004958829
5/4/2017 10:00:00.099 | 0.010678668
5/4/2017 10:00:00.133 | 0.014313085
5/4/2017 10:00:00.166 | 0.004182263
5/4/2017 10:00:00.199 | 0.00232128
5/4/2017 10:00:00.233 | 0.004263132
5/4/2017 10:00:00.266 | 0.007513777
.
.
5/4/2017 10:00:00.999 | 0.011229943 # always skips the "S.000" second
5/4/2017 10:00:01.033 | 0.016148495 # always skips the "S.000" second
5/4/2017 10:00:01.066 | 0.009239103
5/4/2017 10:00:01.099 | 0.015364848
5/4/2017 10:00:01.133 | 0.032139104
5/4/2017 10:00:01.166 | 0.023679454
5/4/2017 10:00:01.199 | 0.002503840
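How can I use pandas to round each "X.999" timestamp to the nearest second? I want to:
see if a value in the "Time Stamp" column is "X.999"
if it is "X.999":
    change it to "X+1.000" (i.e. round it up)
else:
    leave the time stamp as it is
Then keep repeating these steps until I have built a new set of timestamps for a new dataframe. Or, even better, replace the current dataframe's timestamps with the new ones.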
Answer 0: (score: 0)
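Here is my solution so far... it took a long time, but it works now. Take any timestamp whose microsecond value is above 975000 (975 ms), add its difference from 1 second to itself, and replace the timestamp with that sum (which always lands on X.000 seconds).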
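To make the arithmetic concrete, here is a minimal sketch on a single timestamp, using only the standard library. The helper name `round_up_if_near_second` and its `threshold_us` parameter are made up for illustration; the 975000 microsecond cutoff is the one used in the loop.

```python
import datetime

def round_up_if_near_second(ts, threshold_us=975000):
    """Bump ts to the next whole second if its microsecond part exceeds threshold_us."""
    if ts.microsecond > threshold_us:
        # Difference between this timestamp and the next whole second
        gap = datetime.timedelta(microseconds=1_000_000 - ts.microsecond)
        return ts + gap
    return ts

near = datetime.datetime(2017, 5, 4, 10, 0, 0, 999000)  # 10:00:00.999
far = datetime.datetime(2017, 5, 4, 10, 0, 1, 33000)    # 10:00:01.033

print(round_up_if_near_second(near))  # 2017-05-04 10:00:01
print(round_up_if_near_second(far))   # 2017-05-04 10:00:01.033000
```

Because 1,000,000 minus the microsecond part is exactly the gap to the next second, the sum always lands on X.000.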
Here is the code.
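%matplotlib inline
import seaborn as sns
import matplotlib as mp
import pandas as pd
import numpy as np
import datetime

# sheet_name is the current keyword (sheetname was removed from pandas)
dfa = pd.read_excel("may2017.xlsx", sheet_name=0)
# skip_rows is not a read_excel argument, so it is dropped; the iloc slice below selects the rows
dfb = pd.read_excel("logs.xlsx", sheet_name=0)
dfb = dfb.iloc[746244:754431]  # got the index value range from trial and error
a = dfa
b = dfb
adummy = a  # note: these are aliases of the same dataframes, not copies
bdummy = b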
for i in range(len(adummy.index)):  # loop over every row
    ts = adummy.iloc[i]["Time Stamp"]
    # find the difference between this time stamp and the next whole second
    dif = 1000000 - ts.microsecond
    # change microseconds to a timedelta object for later
    tdelta = datetime.timedelta(microseconds=dif)
    # now check if this row's time stamp is just below a 1-second boundary
    if ts.microsecond > 975000:
        # if near 1 second, 'round up' to the next second by adding tdelta
        tnew = tdelta + ts
        # replace the time stamp in this row with the nearest second above
        adummy["Time Stamp"] = adummy["Time Stamp"].replace(ts, tnew)
        # print this new time to verify the dataframe got updated
        print(adummy.iloc[i]["Time Stamp"])
    else:
        # else if the time stamp is not near 1 second from the low side, do nothing
        pass
Output:
2017-05-04 10:00:01
2017-05-04 10:00:02
2017-05-04 10:00:03
2017-05-04 10:00:04
2017-05-04 10:00:05
2017-05-04 10:00:06
2017-05-04 10:00:07
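The row-by-row loop gets slow on large sheets. As a possible alternative (a sketch, not something I ran on the full data), the same rule can be applied to the whole column at once with the `.dt` accessor. The small dataframe below is built from three rows of Spreadsheet A, and the column name "Position 1" is assumed from the table header.

```python
import pandas as pd

# Three sample rows taken from Spreadsheet A
df = pd.DataFrame({
    "Time Stamp": pd.to_datetime([
        "2017-05-04 10:00:00.266",
        "2017-05-04 10:00:00.999",
        "2017-05-04 10:00:01.033",
    ]),
    "Position 1": [0.007513777, 0.011229943, 0.016148495],
})

ts = df["Time Stamp"]
# True where the time stamp is just below a whole-second boundary
near_next_second = ts.dt.microsecond > 975000
# Keep the other time stamps unchanged; round the flagged ones up to the next second
df["Time Stamp"] = ts.where(~near_next_second, ts.dt.ceil("s"))

print(df)
```

This overwrites the existing "Time Stamp" column in place instead of building a new dataframe, and rows below the threshold are left exactly as they were.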