I work with time series and am trying to write a function that computes monthly averages. Here are some helper functions I prepared:
import datetime
import numpy
import pandas
def date_range_0(start, end):
    # One datetime per day from start to end, inclusive
    dates = [start + datetime.timedelta(days=i)
             for i in range((end - start).days + 1)]
    return numpy.array(dates)
def date_range_1(start, days):
    # days should be an integer
    return date_range_0(start, start + datetime.timedelta(days - 1))
x = date_range_1(datetime.datetime(2015, 5, 17), 4)
x
The output is a simple array of datetimes:
array([datetime.datetime(2015, 5, 17, 0, 0),
       datetime.datetime(2015, 5, 18, 0, 0),
       datetime.datetime(2015, 5, 19, 0, 0),
       datetime.datetime(2015, 5, 20, 0, 0)], dtype=object)
Then I learned about the groupby function from http://blog.csdn.net/youngbit007/article/details/54288603. I tried an example based on that site, and it worked fine:
df = pandas.DataFrame({'key1': date_range_1(datetime.datetime(2015, 1, 17), 5),
                       'key2': [2015001, 2015001, 2015001, 2015001, 2015001],
                       'data1': 1 + 0.1 * numpy.arange(1, 6)
                       })
df
gives
   data1       key1     key2
0    1.1 2015-01-17  2015001
1    1.2 2015-01-18  2015001
2    1.3 2015-01-19  2015001
3    1.4 2015-01-20  2015001
4    1.5 2015-01-21  2015001
and
grouped=df['data1'].groupby(df['key2'])
grouped.mean()
gives
key2
2015001    1.3
Name: data1, dtype: float64
Then I tried my own example:
datedat = numpy.array([date_range_1(datetime.datetime(2015, 1, 17), 5), 1 + 0.1 * numpy.arange(1, 6)]).T
months = [day.month for day in datedat[:, 0]]
years = [day.year for day in datedat[:, 0]]
datedatF = pandas.DataFrame({'key1': datedat[:, 0],
                             'key2': list(numpy.array(years) * 1000 + numpy.array(months)),
                             'data1': datedat[:, 1]})
datedatF
produces
   data1       key1     key2
0    1.1 2015-01-17  2015001
1    1.2 2015-01-18  2015001
2    1.3 2015-01-19  2015001
3    1.4 2015-01-20  2015001
4    1.5 2015-01-21  2015001
Note that this looks exactly the same as the one above! So far so good. Then I ran:
grouped2=datedatF['data1'].groupby(datedatF['key2'])
grouped2.mean()
It threw this:
---------------------------------------------------------------------------
DataError Traceback (most recent call last)
<ipython-input-170-f0d2bc225b88> in <module>()
1 grouped2=datedatF['data1'].groupby(datedatF['key2'])
----> 2 grouped2.mean()
/root/anaconda3/lib/python3.6/site-packages/pandas/core/groupby.py in mean(self, *args, **kwargs)
1017 nv.validate_groupby_func('mean', args, kwargs)
1018 try:
-> 1019 return self._cython_agg_general('mean')
1020 except GroupByError:
1021 raise
/root/anaconda3/lib/python3.6/site-packages/pandas/core/groupby.py in _cython_agg_general(self, how, numeric_only)
806
807 if len(output) == 0:
--> 808 raise DataError('No numeric types to aggregate')
809
810 return self._wrap_aggregated_output(output, names)
DataError: No numeric types to aggregate
What did I do wrong? Why can't I take the mean of the second pandas.DataFrame? It looks exactly the same as the example that worked!
Answer 0 (score: 4)
The dtype of data1 in your df is object, so you need to apply pd.to_numeric (with pandas imported as pd):
datedatF.dtypes
Out[39]:
data1 object
key1 datetime64[ns]
key2 int64
dtype: object
grouped2=pd.to_numeric(datedatF['data1']).groupby(datedatF['key2'])
grouped2.mean()
Out[41]:
key2
2015001 1.3
Name: data1, dtype: float64
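For what it's worth, here is a minimal sketch of where the object dtype probably comes from, assuming the cause is that the question packs datetimes and floats into a single numpy array before building the DataFrame. The names dates, values, mixed and frame are made up for this illustration and are not from the question.

```python
import datetime
import numpy
import pandas

# Five consecutive days and five floats, as in the question
dates = numpy.array([datetime.datetime(2015, 1, 17) + datetime.timedelta(days=i)
                     for i in range(5)])
values = 1 + 0.1 * numpy.arange(1, 6)

# A single array can hold only one dtype, so mixing datetimes
# and floats makes every element a generic Python object.
mixed = numpy.array([dates, values]).T
print(mixed.dtype)

# The float column sliced out of that array is still object dtype
frame = pandas.DataFrame({'key2': [2015001] * 5, 'data1': mixed[:, 1]})
dtype_before = frame['data1'].dtype
print(dtype_before)

# Converting it back to a numeric dtype lets mean() work again
frame['data1'] = pandas.to_numeric(frame['data1'])
result = frame.groupby('key2')['data1'].mean()
print(result)
```

The printed frame looks the same before and after the conversion, which is why the two DataFrames in the question appear identical; only dtypes reveals the difference.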
Answer 1 (score: 3)
The data1 column in your DataFrame has object (string) dtype, so inspect the dtypes and convert it before aggregating:
In [396]: datedatF.dtypes
Out[396]:
data1 object # <--- NOTE!
key1 datetime64[ns]
key2 int64
dtype: object
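One way to sidestep the conversion entirely, sketched here under the assumption that the dates and values can be kept in separate columns from the start, is to never put them in the same numpy array. The use of pandas.date_range and the names frame and monthly_mean are choices made for this illustration, not part of the question.

```python
import numpy
import pandas

# Keep dates and values in separate columns so each keeps its own dtype
frame = pandas.DataFrame({
    'key1': pandas.date_range('2015-01-17', periods=5, freq='D'),
    'data1': 1 + 0.1 * numpy.arange(1, 6),
})

# Same year * 1000 + month key as in the question
frame['key2'] = frame['key1'].dt.year * 1000 + frame['key1'].dt.month

# data1 is float64 here, so the aggregation needs no conversion
monthly_mean = frame.groupby('key2')['data1'].mean()
print(frame.dtypes)
print(monthly_mean)
```

Grouping by the column name and then selecting data1 gives the same result as the Series-based groupby in the question, and it extends naturally once the data spans more than one month.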