Changing logic to group by certain values so that it runs within each group

Time: 2019-10-02 03:21:00

Tags: python-3.x pandas pandas-groupby

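This code works if the site values are ignored. I converted the logic to Python from an old system that handled only one site. It now runs in Python, but it needs to run on each site separately.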

This is obviously not the complete code, just the affected columns and the site logic.

Basically, the code takes 'real_value' and smooths it so that 'smoothed_value' contains no negative numbers at all.
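To make the idea concrete, here is a minimal sketch of the smoothing on a plain Python list. It assumes the window around a negative value grows by one step on each side (where the data allows) until the window sum is no longer negative, and that every value in the window is then replaced by the window mean. The function name `smooth_non_negative` is hypothetical and not part of the original system.

```python
def smooth_non_negative(values):
    """Spread negative values over a growing window (illustrative sketch)."""
    out = list(values)
    n = len(out)
    for i in range(n):
        total = out[i]
        left = right = i
        # Grow the window until its sum is non-negative or the data runs out
        while total < 0 and (left > 0 or right < n - 1):
            if left > 0:
                left -= 1
                total += out[left]
            if right < n - 1:
                right += 1
                total += out[right]
        # Replace every value in the window with the window mean
        mean = total / (right - left + 1)
        for k in range(left, right + 1):
            out[k] = mean
    return out


print(smooth_non_negative([5.0, -2.0, 3.0, 4.0]))  # [2.0, 2.0, 2.0, 4.0]
```

Because each window is replaced by its own mean, the overall total stays the same (10.0 in this example); only the distribution of the values changes.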

I don't know how to apply the logic to a single site when all the sites are in the data frame at the same time.

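I'm not sure where to start with grouping by site. I have tried a few things, and the commented-out code (at the bottom) is the closest I have got, but it still doesn't look right. I know the main FOR loop has to change, but I'm not sure how.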
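One possible starting point, sketched under the assumption that the rows of each site are contiguous after sorting: the window limits could be clamped to each site's first and last row position instead of to 0 and the end of the data frame. `GroupBy.indices` returns the row positions of every group; the names `df` and `limits` below are hypothetical.

```python
import pandas as pd

# Small stand-in for the real data (values are made up for illustration)
df = pd.DataFrame({
    "site": ["A", "A", "A", "B", "B"],
    "real_value": [1.0, -2.0, 3.0, 4.0, -1.0],
})

# Map each site to the (first, last) row position it occupies
limits = {
    site: (int(pos.min()), int(pos.max()))
    for site, pos in df.groupby("site").indices.items()
}

print(limits)  # {'A': (0, 2), 'B': (3, 4)}
```

With limits like these, a window that starts in site A would stop at row 2 rather than reaching into site B.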

import pandas as pd

Storage=pd.DataFrame()

Storage['site'] = [
'A','A','A','A','A','A','C','C','C','C',
'D','D','D','D','D','D','D','D','D','D',
'B','B','B','B','B','B','B','B','B','B']

Storage['real_value'] = [
2593.769191,2770.73389599994,6514.75004600001,6158.58129200005,2440.53634399994,
-136.246671999981,455.359255999961,-122.125297999993,9456.82494400006,18282.550165,
8913.47572500005,524.379928000032,928.181714999916,926.490542000033,407.473650000021,
1883.205675,-13405.49748,18178.816992,-7543.11027599997,644.578617999975,
168.571999999974,138.258188000032,217.615295999974,3718.12751199997,9250.91240000011,
7102.1812419999,1376.81937600004,406.029493999961,415.965640000007,3439.79298800006]

# Smooth the data in order to remove negative values
Storage['smoothed_value']=Storage['real_value']

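# Stable sort, so rows keep their existing (date) order within each site
Storage = Storage.sort_values('site', kind='mergesort')

# Renumber the rows 0..n-1 and discard the old index
Storage = Storage.reset_index(drop=True)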

#print (Storage.shape[0])

n_rows = Storage.shape[0]

for i in range(n_rows):
    j = 0
    # 'total' rather than 'sum', which would shadow the built-in;
    # .at replaces Series.get_values(), which newer pandas no longer provides
    total = Storage.at[i, 'smoothed_value']
    leftlim = i
    rightlim = i
    # Stop widening once the window already covers the whole data set
    while total < 0 and (leftlim > 0 or rightlim < n_rows - 1):
        j = j + 1
        if i - j < 0:
            # Smoothing window limited by start of data set
            leftlim = 0
            rightlim = i + j
            total = total + Storage.at[rightlim, 'smoothed_value']
        elif i + j >= n_rows:
            # Smoothing window limited by end of data set
            leftlim = i - j
            rightlim = n_rows - 1
            total = total + Storage.at[leftlim, 'smoothed_value']
        else:
            # Smoothing window not limited by data set length
            leftlim = i - j
            rightlim = i + j
            total = total + Storage.at[leftlim, 'smoothed_value'] + Storage.at[rightlim, 'smoothed_value']
    # Replace every value in the window with the window mean
    for k in range(leftlim, rightlim + 1):
        Storage.at[k, 'smoothed_value'] = total / (rightlim - leftlim + 1)

print(Storage.to_string())
# Compare the totals before and after smoothing
print(Storage['real_value'].sum(), Storage['smoothed_value'].sum(), '\n\n\n')

#for i,g in Storage.groupby('site'):
#    print (i, g)
#    print (Storage['smoothed_value'].sum() , Storage['smoothed_value'].sum(), '\n\n\n')

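I want the smoothing calculation to work on one site at a time, so that by the end of the code no site has negative values. The real_value entries can be changed for testing.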

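The production version can contain dozens of sites and hundreds of real_value entries (already sorted in date order), and it must produce the same output.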
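As a rough sketch of what per-site smoothing could look like, the window logic can be wrapped in a function that receives one site's values and is applied with `groupby(...).transform`, so that a window never reaches into a neighbouring site. The function name `smooth_site`, the frame `df` and its made-up values are assumptions for illustration, not the production code.

```python
import pandas as pd


def smooth_site(series):
    """Smooth one site's values so that none stay negative (sketch)."""
    out = series.tolist()
    n = len(out)
    for i in range(n):
        total = out[i]
        left = right = i
        # Widen the window inside this site only
        while total < 0 and (left > 0 or right < n - 1):
            if left > 0:
                left -= 1
                total += out[left]
            if right < n - 1:
                right += 1
                total += out[right]
        mean = total / (right - left + 1)
        for k in range(left, right + 1):
            out[k] = mean
    # Keep the original index so the result aligns with the data frame
    return pd.Series(out, index=series.index)


df = pd.DataFrame({
    "site": ["A", "A", "A", "B", "B", "B"],
    "real_value": [5.0, -2.0, 3.0, -1.0, 4.0, 1.0],
})

# transform calls smooth_site once per site and aligns the results by index
df["smoothed_value"] = (
    df.groupby("site", sort=False)["real_value"].transform(smooth_site)
)

print(df)
```

In this toy data the negative value at the start of site B is averaged only with the next B row (giving 1.5 and 1.5), not with the last row of site A. Because `transform` aligns on the index, the rows do not have to be sorted by site first, although they are assumed to be in date order within each site. If a site's values sum to less than zero overall, the result for that site still stays negative.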

0 answers:

No answers