How to extract a pattern from the data in two columns

Time: 2017-01-26 15:19:56

Tags: python pandas

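I want to find out what happens in the Power_2 column after Power_1 has turned on.

I have a dataframe (similar to the one below) that contains data for two electrical loads, Power_1 and Power_2, and I am trying to understand how often Power_2 turns on after Power_1 has turned on. My actual data has many more rows than the dataframe below, but I wanted to illustrate what my data looks like.

Basically, I am trying to find out whether there is a pattern, or to visualize the cases where Power_2 is turned on after Power_1.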

The hypothesis I want to test is this: if Power_2 is turned on after Power_1, that indicates some kind of pattern. If Power_2 is not turned on after Power_1, the user is doing something different.

TimeStamp   Power_1       Power_2
6:00:00     0             0
6:00:01     0             0
6:00:02     0.538906412   0
6:00:03     0.230903467   0
6:00:04     0             0.002241299
6:00:05     0             0.971594583
6:00:06     0             0
6:00:07     0             0
6:00:08     0             0
6:00:09     0.898974742   0
6:00:10     0.266201046   0
6:00:11     0             0.752396849
6:00:12     0             0.662316668
6:00:13     0             0.721062372
6:00:14     0             0
6:00:15     0             0.344280835
6:00:16     0.149564236   0
6:00:17     0.5211515     0
6:00:18     0.957654133   0
6:00:19     0             0
6:00:20     0             0

3 answers:

Answer 0: (score: 1)

I mapped your data to a new structure: timestamp + PowerSource, where PowerSource can be 0, 1 or 2.

TimeStamp   PowerSource
6:00:00    0
6:00:01    0
6:00:02    1
6:00:03    1
6:00:04    2
6:00:05    2
6:00:06    0
6:00:07    0
6:00:08    0
6:00:09    1
6:00:10    1
6:00:11    2
6:00:12    2
6:00:13    2
6:00:14    0
6:00:15    2
6:00:16    1
6:00:17    1
6:00:18    1
6:00:19    0
6:00:20    0
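
The answer does not show how the mapping was produced. A minimal pandas sketch of one way to do it follows, assuming a load counts as "on" whenever its reading is greater than 0 and that Power_1 takes precedence if both happen to be on. The variable names `conditions` and `mapped` are made up for this illustration, and only a few rows of the sample data are retyped.

```python
import numpy as np
import pandas as pd

# A few rows of the question's sample data, retyped here for illustration.
df = pd.DataFrame({
    "TimeStamp": ["6:00:01", "6:00:02", "6:00:03", "6:00:04", "6:00:05", "6:00:06"],
    "Power_1": [0.0, 0.538906412, 0.230903467, 0.0, 0.0, 0.0],
    "Power_2": [0.0, 0.0, 0.0, 0.002241299, 0.971594583, 0.0],
})

# Assumption: a load is "on" whenever its reading is greater than 0.
# np.select uses the first condition that matches, so Power_1 wins if both are on.
conditions = [df["Power_1"] > 0, df["Power_2"] > 0]
df["PowerSource"] = np.select(conditions, [1, 2], default=0)

mapped = df[["TimeStamp", "PowerSource"]]
print(mapped)
```

If both loads can genuinely be on at the same time in the real data, a fourth code (for example 3) would probably be a better choice than letting one column win.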

Then iterate over the new structure and check for changes in the power source, looking only at 1 and 2.

// Each element looks like { timeStamp: "6:00:04", powerSource: 2 }.
var array = [ /* all of your data */ ];
var lastPowerSource = 0;
for (var i = 0; i < array.length; i++) {
  var value = array[i];
  var time = value.timeStamp;
  var powerSource = value.powerSource;
  // Only report direct switches between sources 1 and 2; skip anything involving 0 (off).
  if (lastPowerSource != 0 && powerSource != 0) {
    if (powerSource != lastPowerSource) {
      console.log("Power source changed from " + lastPowerSource + " to " + powerSource + " at " + time);
    }
  }
  lastPowerSource = powerSource;
}

This gives you something like:

Power source changed from 1 to 2 at 6:00:04
Power source changed from 1 to 2 at 6:00:11
Power source changed from 2 to 1 at 6:00:16

Obviously this is only pseudocode (well, actually JavaScript); if needed, you can add a more specific check on particular values of lastPowerSource and powerSource.
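
Since the question is tagged python, here is a rough Python sketch of the same loop. It assumes the data is already in the mapped form shown in the table above; the function name `power_source_changes` and the `records` list are hypothetical names chosen for this example. A switch is only recorded when both the previous and the current reading are non-zero.

```python
def power_source_changes(records):
    """Return (previous, current, time) for each direct switch between sources 1 and 2.

    `records` is an iterable of (time_stamp, power_source) pairs sorted by time.
    """
    changes = []
    last = 0
    for time_stamp, source in records:
        # A change only counts when both readings are non-zero and different.
        if last != 0 and source != 0 and source != last:
            changes.append((last, source, time_stamp))
        last = source
    return changes


# PowerSource column of the mapped table, starting at 6:00:00.
sources = [0, 0, 1, 1, 2, 2, 0, 0, 0, 1, 1, 2, 2, 2, 0, 2, 1, 1, 1, 0, 0]
records = [("6:00:%02d" % i, s) for i, s in enumerate(sources)]

changes = power_source_changes(records)
for prev, cur, ts in changes:
    print(f"Power source changed from {prev} to {cur} at {ts}")
```

Note that the Power_2 reading at 6:00:15 is not reported, because the row before it is 0 rather than a different power source.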

Answer 1: (score: 1)

I think the code below will give you the pattern when column 1 is greater than zero and column 2 is greater than zero:

a['pattern']=(a['Power_1']>0) & (a['Power_2']>0)
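
The expression above only flags rows where both columns are positive at the same timestamp. If the goal is instead to flag rows where Power_2 is on immediately after a row where Power_1 was on, one possible variant is to compare against the previous row with `shift`. This is a sketch under that assumption, not the answerer's method; the column name `p1_then_p2` is made up, and only a few sample rows are retyped.

```python
import pandas as pd

a = pd.DataFrame({
    "TimeStamp": ["6:00:02", "6:00:03", "6:00:04", "6:00:05", "6:00:06"],
    "Power_1": [0.538906412, 0.230903467, 0.0, 0.0, 0.0],
    "Power_2": [0.0, 0.0, 0.002241299, 0.971594583, 0.0],
})

# Was Power_1 on in the previous row? The first row has no predecessor,
# so fill_value=0 treats it as "off".
prev_power_1_on = a["Power_1"].shift(1, fill_value=0) > 0

# Flag rows where Power_2 is on right after a row where Power_1 was on.
a["p1_then_p2"] = prev_power_1_on & (a["Power_2"] > 0)
print(a)
```

This relies on the rows being sorted by TimeStamp and evenly spaced; a gap of even one row with both loads off means the switch is not flagged.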

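Answer 2: (score: 1)

Your question is not very clear, but I am assuming that a power source is on when its value is greater than 0. I am also assuming that df is always sorted by the TimeStamp column.

import pandas as pd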

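Power1=False
Power2=False
# Which source was on in the previous row: 'Power_1', 'Power_2' or 'None'
PreviousPower='None'
# Row positions collected for each case (previous row and current row)
grpcontPowerone=[]
grpcontPowertwo=[]
grpPoweronetotwo=[]

# Note: index-1 assumes df has the default 0..n-1 integer index
for index, row in df.iterrows():
    if row['Power_1']>0 and PreviousPower=='Power_1':
        # Power_1 stays on across two consecutive rows
        grpcontPowerone.extend([index-1,index])
    elif row['Power_2']>0 and PreviousPower=='Power_2':
        # Power_2 stays on across two consecutive rows
        grpcontPowertwo.extend([index-1,index])
    elif row['Power_2']>0 and PreviousPower=='Power_1':
        # Power_1 was on in the previous row and Power_2 is on now
        grpPoweronetotwo.extend([index-1,index])

    # Remember the state of this row for the next iteration
    if row['Power_1']>0:
        PreviousPower='Power_1'
    elif row['Power_2']>0:
        PreviousPower='Power_2'
    else:
        PreviousPower='None'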

print("When power 1 is continuously turned on")
print(df.iloc[list(set(grpcontPowerone))].sort_values('TimeStamp'))

print("When power 2 is continuously turned on")
print(df.iloc[list(set(grpcontPowertwo))].sort_values('TimeStamp'))

print("Power is switched from one to two")
print(df.iloc[list(set(grpPoweronetotwo))].sort_values('TimeStamp'))

Output

When power 1 is continuously turned on
   TimeStamp   Power_1  Power_2
2    6:00:02  0.538906      0.0
3    6:00:03  0.230903      0.0
9    6:00:09  0.898975      0.0
10   6:00:10  0.266201      0.0
16   6:00:16  0.149564      0.0
17   6:00:17  0.521151      0.0
18   6:00:18  0.957654      0.0
When power 2 is continuously turned on
   TimeStamp  Power_1   Power_2
4    6:00:04      0.0  0.002241
5    6:00:05      0.0  0.971595
11   6:00:11      0.0  0.752397
12   6:00:12      0.0  0.662317
13   6:00:13      0.0  0.721062
Power is switched from one to two
   TimeStamp   Power_1   Power_2
3    6:00:03  0.230903  0.000000
4    6:00:04  0.000000  0.002241
10   6:00:10  0.266201  0.000000
11   6:00:11  0.000000  0.752397
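
The question also asks how often Power_2 comes on after Power_1. One possible way to count that is to collapse consecutive rows with the same state into runs and check what follows each Power_1 run. The sketch below assumes "after" means "in the very next run, with no off period in between" and that a reading greater than 0 means on; the sample values are rounded, and the names `state`, `runs`, `n_p1_runs` and `n_followed` are made up for this illustration.

```python
import pandas as pd

# Sample data from the question, values rounded for brevity.
power_1 = [0, 0, 0.54, 0.23, 0, 0, 0, 0, 0, 0.90, 0.27,
           0, 0, 0, 0, 0, 0.15, 0.52, 0.96, 0, 0]
power_2 = [0, 0, 0, 0, 0.002, 0.97, 0, 0, 0, 0, 0,
           0.75, 0.66, 0.72, 0, 0.34, 0, 0, 0, 0, 0]
df = pd.DataFrame({"Power_1": power_1, "Power_2": power_2})

# Per-row state: 1 = Power_1 on, 2 = Power_2 on, 0 = neither.
state = pd.Series(0, index=df.index)
state[df["Power_2"] > 0] = 2
state[df["Power_1"] > 0] = 1

# Keep only the first row of each run of identical states.
runs = state[state != state.shift()]
next_state = runs.shift(-1)

is_p1_run = runs == 1
n_p1_runs = int(is_p1_run.sum())
n_followed = int((is_p1_run & (next_state == 2)).sum())

print(f"{n_followed} of {n_p1_runs} Power_1 runs are directly followed by Power_2")
```

On the sample data, the third Power_1 run (6:00:16 to 6:00:18) is followed by an off period, so it does not count as a switch to Power_2.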