Question

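A log file containing strangely formatted date/time values is keeping me awake, as I have not managed to reverse engineer the algorithm used. Using DCode from www.digital-detective.co.uk did not work either. What I do know: it is a 64-bit value, and it starts on 1900-01-01. Some examples: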

Date/Time:                  Hexadecimal representation:
      1900-01-01 00:00:00 > 0000000000000000
      2006-07-19 00:00:00 > 0000000000A000E3
      2008-04-14 00:00:00 > 00000000000050E3
      2008-04-15 11:04:32 > 00D6CF74C42E50E3
      2008-04-15 11:04:46 > 00CEA0C8C52E50E3
      2008-04-15 11:08:32 > 00EC3B36DB2E50E3
      2008-04-16 11:08:43 > 008B3B41DC4E50E3
      2008-04-16 11:21:02 > 00B3AD52224F50E3
      2012-02-21 00:00:00 > 00000000000000E4
      2012-03-13 13:37:54 > 007A35F12CB202E4
      2012-10-22 16:27:13 > 001F7A2AF0951EE4

Any idea how to convert the given values to a date/time, so I can get my sleep back?

Answer 1

I can tell you for sure that the value needs either an endianness swap or a reversal. All the dates except the epoch end in E3 or E4, while most of the other digits change a lot. The values for times at 00:00:00 also have nothing but zeroes on the left side, which further supports this.

I did some quick math, though, and the value doesn't seem to be in seconds, nanoseconds, microseconds, or any other obvious unit. There's definitely a linear correlation between the values and the dates, though. What I would do from here is generate a batch of times that are a few seconds apart and see how the value changes. This will help confirm whether you need to reverse the digits or flip the endianness, and might hint at how the value relates to a second.

EDIT: it's definitely an endianness flip, not a string reversal. If it were a reversal, it would have gone E3->F3, not E3->E4. doh.

From here, I would generate times 100 seconds, 1000 seconds, and 10000 seconds after the epoch. This may give you an extra hint.
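To make the endianness idea concrete, here is a minimal Python sketch that reads a few of the posted hex strings as little-endian 64-bit integers. The helper name `as_little_endian` is made up for this illustration; the sample values are copied from the question. It checks that the integers increase with the date and estimates how many units pass per second from the two samples that are 14 seconds apart.

```python
# Samples copied from the question: (date/time, hex string as logged).
samples = [
    ("2008-04-14 00:00:00", "00000000000050E3"),
    ("2008-04-15 11:04:32", "00D6CF74C42E50E3"),
    ("2008-04-15 11:04:46", "00CEA0C8C52E50E3"),
    ("2012-02-21 00:00:00", "00000000000000E4"),
]

def as_little_endian(hex_str):
    """Interpret the logged hex string as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(hex_str), byteorder="little")

values = [as_little_endian(h) for _, h in samples]

# With the bytes flipped, later dates should give larger integers.
is_increasing = all(a < b for a, b in zip(values, values[1:]))

# The second and third samples are 14 seconds apart.
units_per_second = (values[2] - values[1]) / 14

print(is_increasing, round(units_per_second))
```

The rate comes out at roughly 4.07e8 units per second, which is not a round unit such as microseconds or nanoseconds and fits the observation above.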

Answer 2

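Building on Nick Cano's answer, it looks like the hex pairs need to be read from right to left. You can convert a hex value to a date with the following steps:

1. Reverse the hex value.
2. Convert the hex value to bytes.
3. Byte-swap: unpack the bytes as big-endian, then repack them as little-endian.
4. Convert the hex value to an integer, Int.
5. Compute Timestamp = A*Int + B, where A = 2.45563569478691e-09 and B = -39014179200.000061.
6. Convert the Timestamp (seconds since the epoch 1970-01-01) to a Date.

Taken together, steps 1 to 3 amount to reversing the byte order of the original hex string (compare the Hex and Flip columns in the table).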
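The steps above can be sketched as a single function without pandas. This is only an illustration under some assumptions: the function name `hex_to_datetime` is made up, the byte reversal is done with `int.from_bytes(..., byteorder="little")` instead of the reverse-then-swap route, and the result is rounded to the nearest second to absorb floating-point error in the printed constants.

```python
from datetime import datetime, timedelta

# Constants quoted in the answer (found there by linear regression).
A = 2.45563569478691e-09
B = -39014179200.000061

def hex_to_datetime(hex_str):
    """Convert a logged 16-digit hex string to a naive datetime."""
    # Reading the bytes as little-endian is equivalent to reversing
    # the byte order and then parsing the result as an integer.
    n = int.from_bytes(bytes.fromhex(hex_str), byteorder="little")
    seconds = A * n + B  # seconds since 1970-01-01
    return datetime(1970, 1, 1) + timedelta(seconds=round(seconds))

print(hex_to_datetime("001F7A2AF0951EE4"))
print(hex_to_datetime("0000000000A000E3"))
```

Because of the rounding, individual results may differ from the logged time by up to one second.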

Applying these steps to the posted data gives the following result:

              Datetime               Hex              RHex              Flip  \
0  1900-01-01 00:00:00  0000000000000000  0000000000000000  0000000000000000   
1  2006-07-19 00:00:00  0000000000A000E3  3E000A0000000000  E300A00000000000   
2  2008-04-14 00:00:00  00000000000050E3  3E05000000000000  E350000000000000   
3  2008-04-15 11:04:32  00D6CF74C42E50E3  3E05E24C47FC6D00  E3502EC474CFD600   
4  2008-04-15 11:04:46  00CEA0C8C52E50E3  3E05E25C8C0AEC00  E3502EC5C8A0CE00   
5  2008-04-15 11:08:32  00EC3B36DB2E50E3  3E05E2BD63B3CE00  E3502EDB363BEC00   
6  2008-04-16 11:08:43  008B3B41DC4E50E3  3E05E4CD14B3B800  E3504EDC413B8B00   
7  2008-04-16 11:21:02  00B3AD52224F50E3  3E05F42225DA3B00  E3504F2252ADB300   
8  2012-02-21 00:00:00  00000000000000E4  4E00000000000000  E400000000000000   
9  2012-03-13 13:37:54  007A35F12CB202E4  4E202BC21F53A700  E402B22CF1357A00   
10 2012-10-22 16:27:13  001F7A2AF0951EE4  4EE1590FA2A7F100  E41E95F02A7A1F00   

                     Int    Timestamp                 Date  
0                      0 -39014179200  0733-09-10 00:00:00  
1   16357249768470085632   1153267200  2006-07-19 00:00:00  
2   16379591844746493952   1208131200  2008-04-14 00:00:00  
3   16379643266054739456   1208257472  2008-04-15 11:04:32  
4   16379643271755910656   1208257486  2008-04-15 11:04:46  
5   16379643363789106176   1208257712  2008-04-15 11:08:32  
6   16379678552640686848   1208344123  2008-04-16 11:08:43  
7   16379678853581091584   1208344862  2008-04-16 11:21:02  
8   16429131440647569408   1329782400  2012-02-21 00:00:00  
9   16429890296696109568   1331645874  2012-03-13 13:37:54  
10  16437740548686225152   1350923233  2012-10-22 16:27:13

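Note that the computed (last) Date column matches the given (first) Datetime column, except for 1900-01-01 00:00:00. My guess is that 1900-01-01 00:00:00 was inserted by MySQL as the default value for an invalid date.

The magic constants A and B were found by applying linear regression to the data. Given the Datetimes, the corresponding Timestamps can be computed. A and B are then found by fitting the best-fit line of the Timestamps against the Ints.

Here is the Python code I used to explore the problem and generate the table above: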
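As a rough cross-check of the constants, the line can also be drawn through just two of the midnight samples using exact rational arithmetic. This is a simplified two-point stand-in, not the linear regression used in the answer; the helper names `le_int` and `unix_seconds` are made up for the sketch.

```python
from datetime import datetime
from fractions import Fraction

EPOCH = datetime(1970, 1, 1)

def le_int(hex_str):
    """Read the logged hex string as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(hex_str), byteorder="little")

def unix_seconds(text):
    """Seconds between 1970-01-01 and the given date/time (no time zone)."""
    return int((datetime.strptime(text, "%Y-%m-%d %H:%M:%S") - EPOCH).total_seconds())

# Two midnight samples from the question.
x1, y1 = le_int("0000000000A000E3"), unix_seconds("2006-07-19 00:00:00")
x2, y2 = le_int("00000000000050E3"), unix_seconds("2008-04-14 00:00:00")

slope = Fraction(y2 - y1, x2 - x1)   # plays the role of A
intercept = y1 - slope * x1          # plays the role of B

print(float(slope))
print(intercept)
```

For these two samples the slope is exactly 86400 / 2**45 and the intercept is -39014179200, which agree with the fitted A and B to their printed precision. That would be consistent with the value counting 2**45 units per day, although the answer itself does not make that claim.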

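import pandas as pd
import numpy as np
import struct

# 'data' is assumed to hold the posted table with the column headers
# Datetime and Hex, separated by two or more spaces.
df = pd.read_table('data', sep=r'\s{2,}', engine='python',
                   parse_dates=[0], dtype={'Hex': str})

# Reverse the order of the hex digits.
df['RHex'] = df['Hex'].str[::-1]

def flip_endian(x):
    # Treat the 16 hex characters as eight big-endian 16-bit words and repack
    # them as little-endian, which swaps the two digits of every pair.
    swapped = struct.pack('<8h', *struct.unpack('>8h', x.encode('ascii')))
    return swapped.decode('ascii')
df['Flip'] = df['RHex'].apply(flip_endian)

df['Int'] = df['Flip'].apply(lambda x: int(x, 16))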

# The constants were found by linear regression:
# import scipy.stats as stats
# df['Timestamp'] = (df['Datetime'] - pd.Timestamp('1970-1-1')).dt.total_seconds()
# A, B, rval, pval, stderr = stats.linregress(df['Int'], df['Timestamp'])
A, B = 2.45563569478691e-09, -39014179200.000061

df['Timestamp'] = (A*df['Int'] + B).astype(int)
df['Date'] = (np.array(['1970-01-01'], dtype='datetime64[D]') 
              + np.array(df['Timestamp'], dtype='timedelta64[s]')).tolist()

print(df)

Complex unknown date/time representation
