Stop resampling in pandas when the gap between two dates exceeds a given threshold

Time: 2018-06-24 17:13:32

Tags: python pandas dataframe

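I have this data frame. It contains ID, Time, Value, and Gaps (in hours). I am resampling each ID independently. The Gaps column gives the time gap between consecutive timestamps. I resample every 10 minutes, and I want to stop resampling when a consecutive gap is greater than 0.86 hr, return the next row as the original row, and resume resampling until the same condition occurs again. My gap condition is a threshold of 0.86 hours.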
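For readers who want to reproduce the Gaps column, here is a minimal sketch. It assumes, based only on the numbers in the sample data, that each gap is the time from a row to the next row of the same ID, converted from milliseconds to hours, with 0 for the last row of each ID. The author does not state this rule, and the names `MS_PER_HOUR` and `expected` are introduced here for illustration.

```python
import pandas as pd

# ID and Time (epoch milliseconds) copied from the question's sample data.
df = pd.DataFrame({
    'ID':   [1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
    'Time': [1523147332607, 1523148537722, 1523149162202, 1523155013221,
             1523156211662, 1523157551791, 1523156211662, 1523158053390,
             1523159279379, 1523164065751],
})

MS_PER_HOUR = 3_600_000  # hypothetical helper constant

# Assumption: gap = (next Time within the same ID) - (current Time), in hours.
next_time = df.groupby('ID')['Time'].shift(-1)
df['Gaps'] = ((next_time - df['Time']) / MS_PER_HOUR).fillna(0.0)

# Gaps values as listed in the question, for comparison.
expected = [0.3347541666666667, 0.17346666666666666, 1.6252830555555555,
            0.33290027777777775, 0.3722580555555556, 0.0,
            0.5115911111111111, 0.3405525, 1.3295477777777778, 0.0]
print(df)
```

Under that assumption, the rows with gaps above 0.86 are the third row of ID 1 and the third row of ID 2.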

Sample data:

ID,Time,Value,Gaps
1,1523147332607,2,0.3347541666666667
1,1523148537722,5,0.17346666666666666
1,1523149162202,6,1.6252830555555555
1,1523155013221,4,0.33290027777777775
1,1523156211662,7,0.3722580555555556
1,1523157551791,10,0.0
2,1523156211662,5,0.5115911111111111
2,1523158053390,2,0.3405525
2,1523159279379,9,1.3295477777777778
2,1523164065751,3,0.0

As you can see, ID 1 has a gap of more than 0.86 hr (the third row), so my idea is to stop resampling at that point, like this:

ID,Time,Value,Gaps
1,1523147332607,2,0.3347541666666667
...................................
1,1523148537722,5,0.17346666666666666
...................................
...................................
1,1523149162202,6,1.6252830555555555

So I want to keep resampling up to Time 1523149162202, and when there is nothing more to resample, return the last row unchanged, i.e.

1,1523149162202,6,1.6252830555555555

Then I want to continue resampling from the next rows

1,1523155013221,4,0.33290027777777775
1,1523156211662,7,0.3722580555555556
1,1523157551791,10,0.0

and so on.

I can do the normal resampling for each ID. But how do I keep track of each resampling run so that I can stop when the condition is met, return the original row at the end of that run, and then resume from the next row under the same condition? I had an approach in mind, but I am nowhere near getting it to work. Any suggestions?

1 Answer:

Answer 0: (score: 2)

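One approach is to create a temporary column 'ID_res' in df that increments whenever the ID changes or on the row right after a gap greater than 0.86, for example: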

df.loc[(df['ID'] != df['ID'].shift())| (df['Gaps'].shift() > 0.86),'ID_res'] = 1
df['ID_res'] = df['ID_res'].cumsum().ffill()

Your df then looks like this:

                         ID  Value      Gaps  ID_res
Time                                                
2018-04-08 00:28:52.607   1      2  0.334754     1.0
2018-04-08 00:48:57.722   1      5  0.173467     1.0
2018-04-08 00:59:22.202   1      6  1.625283     1.0
2018-04-08 02:36:53.221   1      4  0.332900     2.0
2018-04-08 02:56:51.662   1      7  0.372258     2.0
2018-04-08 03:19:11.791   1     10  0.000000     2.0
2018-04-08 02:56:51.662   2      5  0.511591     3.0
2018-04-08 03:27:33.390   2      2  0.340553     3.0
2018-04-08 03:47:59.379   2      9  1.329548     3.0
2018-04-08 05:07:45.751   2      3  0.000000     4.0
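The table above has Time as a datetime index, which the answer does not show how to build. A minimal sketch of getting there from the question's raw data follows, assuming Time holds epoch milliseconds. The names `RAW`, `GAP_LIMIT_HOURS` and `label_segments` are hypothetical, and the labels are built with a boolean-mask `cumsum`, a small variant of the answer's two lines that yields integers instead of floats.

```python
import io
import pandas as pd

# Sample data from the question.
RAW = """ID,Time,Value,Gaps
1,1523147332607,2,0.3347541666666667
1,1523148537722,5,0.17346666666666666
1,1523149162202,6,1.6252830555555555
1,1523155013221,4,0.33290027777777775
1,1523156211662,7,0.3722580555555556
1,1523157551791,10,0.0
2,1523156211662,5,0.5115911111111111
2,1523158053390,2,0.3405525
2,1523159279379,9,1.3295477777777778
2,1523164065751,3,0.0
"""

GAP_LIMIT_HOURS = 0.86  # threshold stated in the question


def label_segments(df, limit=GAP_LIMIT_HOURS):
    """Return a segment number that increases when the ID changes
    or on the row following a gap larger than `limit`."""
    new_id = df['ID'] != df['ID'].shift()
    after_gap = df['Gaps'].shift() > limit  # NaN on the first row compares as False
    return (new_id | after_gap).cumsum()


df = pd.read_csv(io.StringIO(RAW))
df['Time'] = pd.to_datetime(df['Time'], unit='ms')  # assumption: epoch milliseconds
df = df.set_index('Time')
df['ID_res'] = label_segments(df)
print(df)
```

Each distinct value of `ID_res` then marks one stretch of rows that can be resampled on its own.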

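Now you can group by 'ID_res' first, keep only the 'ID' and 'Value' columns, resample and interpolate within each group, and drop the 'ID_res' column at the end, since it is no longer needed: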

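# Recent pandas versions need a list for column selection, group_keys=False
# so that apply() does not add a second 'ID_res' index level, and drop(columns=...).
df = (df.groupby('ID_res')[['ID', 'Value']].resample('10min').mean()
        .groupby(level=0, group_keys=False).apply(lambda x: x.interpolate())
        .reset_index().drop(columns='ID_res'))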

The result looks like this:

                  Time   ID  Value
0  2018-04-08 00:20:00  1.0    2.0
1  2018-04-08 00:30:00  1.0    3.5
2  2018-04-08 00:40:00  1.0    5.0
3  2018-04-08 00:50:00  1.0    6.0
4  2018-04-08 02:30:00  1.0    4.0
5  2018-04-08 02:40:00  1.0    5.5
6  2018-04-08 02:50:00  1.0    7.0
...

No resampling happens between rows 3 and 4, because the 'Gaps' value between those rows in the original df is greater than 0.86.
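The same idea can also be written as an explicit loop over the segments, which may be easier to follow and does not rely on chaining `groupby` with `resample`. This is a sketch under the same assumptions as the answer (10-minute bins, mean per bin, linear interpolation inside a segment), not the answerer's own code. The names `RAW`, `boundary`, `pieces` and `out` are hypothetical.

```python
import io
import pandas as pd

# Sample data from the question.
RAW = """ID,Time,Value,Gaps
1,1523147332607,2,0.3347541666666667
1,1523148537722,5,0.17346666666666666
1,1523149162202,6,1.6252830555555555
1,1523155013221,4,0.33290027777777775
1,1523156211662,7,0.3722580555555556
1,1523157551791,10,0.0
2,1523156211662,5,0.5115911111111111
2,1523158053390,2,0.3405525
2,1523159279379,9,1.3295477777777778
2,1523164065751,3,0.0
"""

df = pd.read_csv(io.StringIO(RAW))
df['Time'] = pd.to_datetime(df['Time'], unit='ms')  # assumption: epoch milliseconds
df = df.set_index('Time')

# A new segment starts when the ID changes or right after a gap above 0.86 h.
boundary = (df['ID'] != df['ID'].shift()) | (df['Gaps'].shift() > 0.86)
df['ID_res'] = boundary.cumsum()

pieces = []
for _, segment in df.groupby('ID_res'):
    # Resample each segment separately so no bins are created across a large gap.
    resampled = segment[['ID', 'Value']].resample('10min').mean().interpolate()
    pieces.append(resampled)

out = pd.concat(pieces).reset_index()
print(out)
```

Because every segment is resampled independently, the output has no rows for ID 1 between the 00:50 bin and the 02:30 bin, which is the stop-and-resume behaviour the question asks about. Note that, like the answer, this returns binned values rather than the untouched original row at the end of each segment.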