A DataFrame in which Date is a datetime:
Column | Date
:-----------|----------------------:
A | 2018-08-05 17:06:01
A | 2018-08-05 17:06:02
A | 2018-08-05 17:06:03
B | 2018-08-05 17:06:07
B | 2018-08-05 17:06:09
B | 2018-08-05 17:06:11
The returned table should be:
Column | Date
:-----------|----------------------:
A | 2018-08-05 17:06:02
B | 2018-08-05 17:06:09
Answer 0 (score: 1)
Prepare the sample DataFrame:
import numpy as np
import pandas as pd

# Initiate dataframe
date_var = "date"
df = pd.DataFrame(data=[['A', '2018-08-05 17:06:01'],
                        ['A', '2018-08-05 17:06:02'],
                        ['A', '2018-08-05 17:06:03'],
                        ['B', '2018-08-05 17:06:07'],
                        ['B', '2018-08-05 17:06:09'],
                        ['B', '2018-08-05 17:06:11']],
                  columns=['column', date_var])
# Convert the date column to proper pandas datetime values (pd.Timestamp)
df[date_var] = pd.to_datetime(df[date_var])
Extract the desired mean timestamp value:
# Extract the numeric value associated with each timestamp (epoch time)
# NOTE: this is done by accessing the .value attribute of each Timestamp in the column
In:
[tsp.value for tsp in df[date_var]]
Out:
[
1533488761000000000, 1533488762000000000, 1533488763000000000,
1533488767000000000, 1533488769000000000, 1533488771000000000
]
# Use this to calculate the mean, then convert the result back to a timestamp
# NOTE: this is the mean over all rows of the column, not a per-group mean
In:
pd.Timestamp(np.nanmean([tsp.value for tsp in df[date_var]]))
Out:
Timestamp('2018-08-05 17:06:05.500000')
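The mean above covers all six rows, whereas the question asks for one value per group. The following is a minimal sketch of how the same `.value` idea could be applied per group; the helper name `mean_timestamp` and the variable `per_group_mean` are illustrative and not part of the original answer. It uses integer division instead of `np.nanmean` so the epoch values are never converted to floating point.

```python
import pandas as pd

df = pd.DataFrame(
    data=[["A", "2018-08-05 17:06:01"],
          ["A", "2018-08-05 17:06:02"],
          ["A", "2018-08-05 17:06:03"],
          ["B", "2018-08-05 17:06:07"],
          ["B", "2018-08-05 17:06:09"],
          ["B", "2018-08-05 17:06:11"]],
    columns=["column", "date"],
)
df["date"] = pd.to_datetime(df["date"])


def mean_timestamp(timestamps):
    """Average Timestamps via their integer epoch values (hypothetical helper)."""
    # .value is the epoch time in nanoseconds, as a plain Python int
    values = [tsp.value for tsp in timestamps]
    # Integer division keeps the arithmetic exact; a sub-nanosecond remainder is dropped
    return pd.Timestamp(sum(values) // len(values))


# One mean timestamp per value of "column"
per_group_mean = df.groupby("column")["date"].apply(mean_timestamp)
print(per_group_mean)
```

For the sample data this gives 2018-08-05 17:06:02 for group A and 2018-08-05 17:06:09 for group B, which matches the table requested in the question.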
Answer 1 (score: 0)
For example.
Your data:
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[['A', '2018-08-05 17:06:01'],
                        ['A', '2018-08-05 17:06:02'],
                        ['A', '2018-08-05 17:06:03'],
                        ['B', '2018-08-05 17:06:07'],
                        ['B', '2018-08-05 17:06:09'],
                        ['B', '2018-08-05 17:06:11']],
                  columns=['column', 'date'])
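Solution:
# Convert the timestamps to integer epoch values so they can be averaged
df.date = pd.to_datetime(df.date).values.astype(np.int64)
# Average per group, then convert the numeric means back to datetimes
df = pd.DataFrame(pd.to_datetime(df.groupby('column').mean().date))
Output: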
                      date
column
A      2018-08-05 17:06:02
B      2018-08-05 17:06:09
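As an alternative that avoids the integer cast, the per-group mean can also be computed by averaging each timestamp's offset from the group's earliest timestamp. This is only a sketch under that assumption; the helper name `mean_via_offsets` and the variable `result` are made up for illustration and do not come from the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    data=[["A", "2018-08-05 17:06:01"],
          ["A", "2018-08-05 17:06:02"],
          ["A", "2018-08-05 17:06:03"],
          ["B", "2018-08-05 17:06:07"],
          ["B", "2018-08-05 17:06:09"],
          ["B", "2018-08-05 17:06:11"]],
    columns=["column", "date"],
)
df["date"] = pd.to_datetime(df["date"])


def mean_via_offsets(timestamps):
    """Mean of a datetime Series, computed from timedelta offsets (hypothetical helper)."""
    start = timestamps.min()
    # (timestamps - start) is a timedelta Series; its mean is a single Timedelta
    return start + (timestamps - start).mean()


# reset_index() turns the group labels back into a regular "column" column
result = df.groupby("column")["date"].apply(mean_via_offsets).reset_index()
print(result)
```

The result has the same two-column layout as the table in the question, with one row per group.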
I hope this helps.