I can't seem to make sense of this. Unless I'm missing something, what I'm running into seems to involve "datetime" and ...
import pandas as pd
import datetime as dt
import numpy as np
my_dates = ['2021-02-03','2021-02-05','2020-12-25', '2021-12-27','2021-12-12']
my_numbers = [100,200,0,400,500]
df = pd.DataFrame({'a':my_dates, 'b':my_numbers})
df['a']=pd.to_datetime(df['a'])
# ultimate goal is to be able to go. * df.mean() * and be able to see mean DATE
# but this doesn't seem to work so...
df['a'].mean().strftime('%Y-%m-%d') ### ok this works... I can mess around and concat stuff...
# But why won't this work?
df2 = df.select_dtypes('datetime')
df2.mean() # WONT WORK
df2['a'].mean() # WILL WORK?
Answer 0 (score: 0)
You can try passing the numeric_only parameter to the mean() method:
out=df.select_dtypes('datetime').mean(numeric_only=False)
Output of out:
a 2021-06-03 04:48:00
dtype: datetime64[ns]
Note: if the data type is string, this raises an error.
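As a minimal sketch of the suggestion above, the snippet below rebuilds the question's frame and compares the DataFrame-level mean with the Series-level mean. It assumes a pandas version in which `DataFrame.mean` accepts `numeric_only=False` for datetime columns; the variable names `datetime_cols`, `frame_mean`, and `series_mean` are illustrative only.

```python
import pandas as pd

my_dates = ['2021-02-03', '2021-02-05', '2020-12-25', '2021-12-27', '2021-12-12']
my_numbers = [100, 200, 0, 400, 500]
df = pd.DataFrame({'a': pd.to_datetime(my_dates), 'b': my_numbers})

# Keep only the datetime columns, then tell mean() not to restrict itself to numeric data.
datetime_cols = df.select_dtypes('datetime')
frame_mean = datetime_cols.mean(numeric_only=False)

# Series-level mean of the same column, for comparison.
series_mean = df['a'].mean()

print(frame_mean)
print(series_mean)
```

Both calls should report the same mean date for column a.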
Answer 1 (score: 0)
The mean function you are applying is different in each case.
import pandas as pd
import datetime as dt
import numpy as np
my_dates = ['2021-02-03','2021-02-05','2020-12-25', '2021-12-27','2021-12-12']
my_numbers = [100,200,0,400,500]
df = pd.DataFrame({'a':my_dates, 'b':my_numbers})
df['a']=pd.to_datetime(df['a'])
df.mean()
This mean is the DataFrame mean function, which operates on numeric data. To see which columns count as numeric, run:
df._get_numeric_data()
b
0 100
1 200
2 0
3 400
4 500
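Since `_get_numeric_data()` is a private method, a similar check can probably be done with the public `select_dtypes` API. This is a sketch on a shortened version of the frame, not the answer author's method; `numeric_cols` and `numeric_names` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'a': pd.to_datetime(['2021-02-03', '2021-02-05']),
    'b': [100, 200],
})

# include='number' keeps columns with a numeric dtype; the datetime column 'a' is left out.
numeric_cols = df.select_dtypes(include='number')
numeric_names = list(numeric_cols.columns)

print(numeric_names)
```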
But df['a'] is a datetime Series.
df['a'].dtype, type(df)
(dtype('<M8[ns]'), pandas.core.frame.DataFrame)
So df['a'].mean() applies a different mean function, one that handles datetime values. That is why df['a'].mean() returns the mean of the datetime values.
df['a'].mean()
Timestamp('2021-06-03 04:48:00')
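Because the Series-level mean handles datetimes, one way to get a mean date for every datetime column is to call it column by column. This is a hedged sketch rather than a built-in pandas feature; the names `mean_dates` and `formatted` are made up for illustration.

```python
import pandas as pd

my_dates = ['2021-02-03', '2021-02-05', '2020-12-25', '2021-12-27', '2021-12-12']
my_numbers = [100, 200, 0, 400, 500]
df = pd.DataFrame({'a': pd.to_datetime(my_dates), 'b': my_numbers})

# Call Series.mean() on each datetime column individually.
datetime_names = df.select_dtypes('datetime').columns
mean_dates = {col: df[col].mean() for col in datetime_names}

# Format each mean Timestamp as a plain date string, as in the question.
formatted = {col: ts.strftime('%Y-%m-%d') for col, ts in mean_dates.items()}

print(formatted)
```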
Read more here: difference-between-data-type-datetime64ns-and-m8ns; DataFrame.mean() ignores datetime series #28108